Centre for Digital Humanities

CDH workshop: Automatic Speech Recognition using Whisper

Event details

Date: 24 October 2024
Time: 11:30 - 13:00
Venue: Digital Humanities Workspace, Drift 27 (Room 0.32), Utrecht, 3512 BR

During this workshop, Dr. Arjan van Hessen, researcher in Human Language Technology (HLT), will demonstrate how to apply state-of-the-art automatic speech recognition (ASR) to your own materials using OpenAI’s Whisper, WhisperX, Faster-Whisper and other “Whisper add-ons”. Together, these packages form an excellent ASR system built on the Whisper model, which was trained on 680,000 hours of multilingual and multitask supervised data collected from the web.

Automatic Speech Recognition (ASR), the computerized recognition and processing of spoken language, holds significant potential for many disciplines. It can be a valuable tool for anyone working with their own or existing collections of interviews or other spoken recordings, such as oral history interviews, or with derived materials such as interview transcriptions and subtitles. Since the arrival of Whisper in September 2022, it has been possible to convert audio in about 100 languages automatically into a reasonably accurate text.

Of course, the results are not flawless: the spelling of names of people, companies and organizations always needs to be checked. Recognition also tends to be less accurate for speakers with a strong dialect and for children’s speech. These weaknesses are being worked on, and improvements will hopefully show in the results soon.
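
To give an idea of what checking names can look like in practice, here is a minimal Python sketch that applies a hand-made list of corrections to a transcript. It is not one of the tools discussed in the workshop; the function name `apply_name_corrections` and the example corrections are hypothetical and only serve as an illustration.

```python
import re


def apply_name_corrections(text, corrections):
    """Replace known misrecognitions with the correct spelling.

    `corrections` maps a misrecognized form to the desired form.
    Matching is case-insensitive and limited to whole words.
    """
    for wrong, right in corrections.items():
        pattern = re.compile(rf"\b{re.escape(wrong)}\b", flags=re.IGNORECASE)
        # A function is used as the replacement so that `right` is inserted literally.
        text = pattern.sub(lambda _match, right=right: right, text)
    return text


# Hypothetical corrections, collected by hand while proofreading a transcript.
corrections = {"Utreg": "Utrecht", "whisper x": "WhisperX"}
corrected = apply_name_corrections("We ran Whisper X in Utreg.", corrections)
```

A list like this only catches errors you have already seen once, so proofreading the transcript remains necessary.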

Finally, if time permits, the host will discuss some open-source tools for correcting the recognition results. Background and up-to-date information about Whisper and its implementation can be found on the web.

Level

No prior experience with Whisper or ASR is required. This workshop is suitable for all levels, from beginner to advanced. The workshop will be in English.

Preparation

In preparation for the workshop, please bring a few short (max. 10 minutes) audio or video recordings of speech in your own language (and in English). During the workshop, you will try to transcribe them using Whisper.

Participants should bring their own laptops. Recognition on a laptop will probably be slower than desired, but it does show that you can process the material on your own computer, without even needing an internet connection.
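
For those who want a preview, the following is a minimal sketch of local transcription in Python, assuming the open-source `openai-whisper` package and `ffmpeg` are installed. The model size `"base"`, the file name and the helper names are assumptions chosen for illustration, not the workshop's prescribed setup. Note that the model weights are downloaded the first time a model is loaded; after that, recognition runs offline. The second half turns the recognized segments into SRT subtitles.

```python
def transcribe_file(path, model_name="base", language=None):
    """Transcribe one audio/video file locally with openai-whisper.

    The import is done inside the function so the subtitle helpers
    below can also be used without Whisper installed.
    """
    import whisper  # assumption: installed via `pip install openai-whisper`

    model = whisper.load_model(model_name)  # smaller models are faster on a laptop
    # With language=None, Whisper tries to detect the spoken language itself.
    return model.transcribe(path, language=language)


def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Convert Whisper-style segments (start, end, text) to SRT text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Example usage (hypothetical file name; left commented out because it needs Whisper):
# result = transcribe_file("interview.mp3", language="nl")
# print(result["text"])
# print(segments_to_srt(result["segments"]))
```

Larger models generally give better results but take longer on a laptop without a GPU, so a small model is a sensible starting point for a first try.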

You are also very welcome to visit the Digital Humanities walk-in hour, which takes place in the same location (the Digital Humanities Workspace) after this workshop, from 14:00 to 15:00. No sign-up or preparation is necessary for the walk-in hour; we encourage you to just come by!

For whom?

Due to our funding, priority will be given to teachers, researchers and students of the Faculty of Humanities at Utrecht University for this workshop. If you are affiliated with a different faculty or institution but interested in participating, please register to be placed on a waiting list. Notification of available spaces will be sent two weeks before the workshop.

Registration

Please complete the registration form below if you wish to sign up for this workshop. Register early to secure your place, as spots are allocated on a first-come, first-served basis.

If you are unable to attend after registering, please cancel your registration by sending an email to cdh@uu.nl, so that we can offer your spot to another participant. Thank you for your cooperation.