Our paper “ALCIDE: Extracting and visualising content from large document collections to support Humanities studies”, written by Giovanni Moretti, Rachele Sprugnoli, Stefano Menini and Sara Tonelli, has been accepted for publication in the journal Knowledge-Based Systems (IF 3.325) and will be published soon. We are particularly happy about this publication because ALCIDE was the first project we started after the group was created, and the paper nicely summarises much of the work on applied NLP and visualisation that we have done over the last three years.

Abstract

The application of research practices and methodologies from the Information and Communication Technologies to Humanities studies is having a great impact on the way humanities research is being conducted. However, although many applications have been developed to automatically analyse document collections from the historical or the literary domain, they often fail to provide real support to scholars because of their inherent complexity: technical skills are often required to use them and to inspect their output. On the other hand, some systems are more user-friendly, but present basic analyses and are limited to the needs of a specific research community.

In order to overcome the aforementioned limitations, we developed ALCIDE (Analysis of Language and Content In a Digital Environment), a web-based platform designed to assist humanities scholars in navigating and analysing large quantities of textual data such as historical sources and literary works. This suite of tools combines advanced text processing techniques with intuitive visualisations of the output to serve a broad range of research questions, which no other comparable tool can address in a single platform. Textual corpora can be inspected and compared along five semantic dimensions: who, where, when, what and how. Such dimensions in different combinations allow targeting many key questions of different humanities disciplines, as shown in the five use cases presented.