Centre for Digital Humanities

Data Science: Applied Text Mining (Utrecht Summer School)

Event details

Date: 15 July 2024 - 19 July 2024
Time: All day
Venue: Victor J. Koningsberger building, Budapestlaan 4a-b, Utrecht, 3584 CD

From Monday 15 July to Friday 19 July, Utrecht University’s Faculty of Social and Behavioural Sciences offers a Utrecht Summer School course on applied text mining.

In this course, students will learn how to apply text mining methods to text data and analyze the results in a pipeline with machine learning and deep learning algorithms.

Given the rapid rate at which text data are being digitally gathered in many domains of science, there is a growing need for automated tools that can analyze, classify, and interpret these kinds of data. Text mining techniques can be applied to create a structured representation of text, making its content more accessible for researchers. Applications of text mining are everywhere: social media, web search, advertising, emails, customer service, healthcare, marketing, etc.

This course offers an extensive exploration of text mining with Python. It has a strong practical, hands-on focus: students will gain experience in applying text mining to real data and in interpreting the results. Through lectures and practicals, students will learn the skills needed to design, implement, and understand their own text mining pipeline.

The topics in this course include preprocessing text, text classification, topic modeling, word embedding, deep learning models, and responsible text mining.

The course deals with:

  • Reviewing the fundamental approaches to text mining;
  • Understanding and applying current methods for analyzing texts;
  • Defining a text mining pipeline given a practical data science problem;
  • Implementing all steps in a text mining pipeline: feature extraction, feature selection, model learning, model evaluation;
  • Understanding and applying state-of-the-art methods in text mining;
  • Implementing word embedding and advanced deep learning techniques.
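
As a rough illustration of the four pipeline steps named above (feature extraction, feature selection, model learning, model evaluation), here is a minimal sketch in plain Python. It is not taken from the course materials: the toy corpus, the function names, the document-frequency thresholds, and the choice of a Naive Bayes classifier are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Toy labelled corpus, invented purely for illustration.
TRAIN = [
    ("the match ended with a late goal", "sport"),
    ("the team won the cup final", "sport"),
    ("the striker scored a goal in the match", "sport"),
    ("the parliament passed the budget vote", "politics"),
    ("the minister lost the election vote", "politics"),
    ("the election campaign reached parliament", "politics"),
]
TEST = [
    ("a goal decided the cup match", "sport"),
    ("the minister called an election", "politics"),
]


def extract_features(text):
    """Feature extraction: lowercase, tokenize, and count words (bag of words)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def select_features(bags, min_df=2, max_df_ratio=0.8):
    """Feature selection: keep words that are neither too rare nor too common."""
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())
    n_docs = len(bags)
    return {
        word
        for word, count in doc_freq.items()
        if count >= min_df and count / n_docs <= max_df_ratio
    }


def train_naive_bayes(bags, labels, vocab):
    """Model learning: multinomial Naive Bayes with add-one smoothing."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for bag, label in zip(bags, labels):
        for word, count in bag.items():
            if word in vocab:
                word_counts[label][word] += count
    model = {}
    for label, n_label_docs in class_counts.items():
        total = sum(word_counts[label].values())
        log_prior = math.log(n_label_docs / len(labels))
        log_likelihood = {
            word: math.log((word_counts[label][word] + 1) / (total + len(vocab)))
            for word in vocab
        }
        model[label] = (log_prior, log_likelihood)
    return model


def predict(model, bag):
    """Return the label with the highest log-probability score."""
    scores = {}
    for label, (log_prior, log_likelihood) in model.items():
        scores[label] = log_prior + sum(
            count * log_likelihood[word]
            for word, count in bag.items()
            if word in log_likelihood
        )
    return max(scores, key=scores.get)


def evaluate(model, texts, labels):
    """Model evaluation: accuracy on held-out texts."""
    predictions = [predict(model, extract_features(text)) for text in texts]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


train_bags = [extract_features(text) for text, _ in TRAIN]
train_labels = [label for _, label in TRAIN]
vocab = select_features(train_bags)
model = train_naive_bayes(train_bags, train_labels, vocab)
accuracy = evaluate(model, [t for t, _ in TEST], [y for _, y in TEST])
print("selected vocabulary:", sorted(vocab))
print("held-out accuracy:", accuracy)
```

On real data, each step would more likely be handled by an established library than written by hand, and the corpus would be far larger; the sketch only shows how the four steps feed into one another.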

The course starts by reviewing basic text mining concepts and moves on to implementing advanced concepts in natural language processing. By the end of the week, participants will have mastered advanced text mining skills in Python.

Participants should have basic knowledge of, and motivation for, scripting and programming in Python.

The deadline for registration is 1 July 2024. More information can be found here.