Many academic disciplines, spearheaded by, but not limited to, the Humanities and Social Sciences, have embraced digital technologies to process large amounts of text data. Text datasets can be used to quantify trends observed in close reading, or, conversely, to pinpoint sources which might be interesting for closer, manual analysis.
While many specialized software packages exist for tasks like syntactic analysis, topic modelling, or collocation highlighting, the Research Software Lab observed a gap when it comes to searching and filtering digitized text corpora, such as newspaper archives, prior to further analysis steps. Existing software to search and filter text corpora, such as Delpher, often focuses on specific datasets, which limits its general applicability.
I-Analyzer has been developed to bridge this gap. I-Analyzer allows searching and exploring text corpora, visualizing trends, and downloading tables of text and metadata for further analysis. I-Analyzer is open-source software and freely available.
Presently, it offers access to the following corpora:
- Digital Library for Dutch Literature (DBNL)
- Financial reports of Dutch companies
- Dutch newspapers from the Royal Library: public dataset and full dataset (full dataset available upon request)
- Eighteenth Century Collections Online (available for Utrecht University users)
- Jewish Funerary Inscriptions
- Book reviews from Goodreads
- The Guardian-Observer newspaper archives (available for Utrecht University users)
- 19th century UK Periodicals (available for Utrecht University users)
- Dutch court rulings
- Times newspaper archives (available for Utrecht University users)
- Dutch monarchs’ speeches
- Dutch parliamentary debates