Centre for Digital Humanities

Introduction to web scraping

Event details

Date: 09 April 2024
Time: 13:00 - 16:00
Venue: Digital Humanities Workspace, Drift 27 (Room 0.32), Utrecht, 3512 BR

Join CDH software engineer Donatas Rasiukevičius and CDH technical support assistant Sander Prins for an introductory workshop on web scraping.

This workshop introduces the tools and terminology used for gathering data from the web, with a primary focus on the fundamental concepts of web scraping. You will also learn how to use tools such as search engines and AI to help you write a script for your specific purpose. The workshop is highly interactive: making mistakes is welcome, and learning through experimentation is encouraged.

Workshop topics

  • Data sources (database, API, feeds, web page)
  • Querying data
  • Parsing data
  • Selecting, structuring, and output
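As a rough illustration of how these topics fit together, here is a minimal sketch in Python using only the standard library. It is not the workshop's own material: the sample HTML, the `event` class name, and the helper names `EventLinkParser`, `scrape`, and `to_csv` are all hypothetical, and the page is an inline string rather than a real website, so nothing is downloaded.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page. In a real script this text would come from a
# data source such as a web page or a feed.
SAMPLE_HTML = """
<ul>
  <li class="event"><a href="/events/first">First example event</a></li>
  <li class="event"><a href="/events/second">Second example event</a></li>
  <li class="other"><a href="/about">About</a></li>
</ul>
"""


class EventLinkParser(HTMLParser):
    """Parse HTML and collect links found inside <li class="event">."""

    def __init__(self):
        super().__init__()
        self.in_event = False      # are we inside an event list item?
        self.current_href = None   # href of the link being read, if any
        self.rows = []             # structured result: one dict per link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li":
            # Selecting: only list items with class "event" are of interest.
            self.in_event = attrs.get("class") == "event"
        elif tag == "a" and self.in_event:
            self.current_href = attrs.get("href")

    def handle_data(self, data):
        text = data.strip()
        if self.current_href and text:
            # Structuring: store each hit as a small record.
            self.rows.append({"title": text, "url": self.current_href})
            self.current_href = None

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_event = False
        elif tag == "a":
            self.current_href = None


def scrape(html):
    """Parse the HTML text and return the selected rows."""
    parser = EventLinkParser()
    parser.feed(html)
    return parser.rows


def to_csv(rows):
    """Output: turn the rows into CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "url"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = scrape(SAMPLE_HTML)
print(to_csv(rows))
```

The "About" link is skipped because its list item does not have the `event` class. Third-party libraries are often used for the parsing step in practice; the standard library is used here only to keep the sketch free of extra installations.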

Level

This workshop is designed for absolute beginners.

Preparation

The workshop uses an online environment to run Python scripts. To use this service, make sure you have a GitHub account. It would also be useful to have a free OpenAI account so that you can use ChatGPT during the session. No additional installations are needed.

Target audience

Owing to our funding, priority for this workshop is given to humanities teachers, researchers, and students. If you are affiliated with a different faculty or institution but are interested in participating, please register to be placed on a waiting list. You will be notified of any available spaces two weeks before the workshop.


To secure your spot, please register as soon as possible: registrations are processed on a first-come, first-served basis. If you are unable to attend, please cancel your registration by sending an email to CDH@uu.nl so that we can offer the spot to another interested participant. Thank you for your cooperation.