Research

So far, Human Language Technologies have not contributed substantially to the development of the Humanities, even though they have evolved to a point where they can provide the Humanities with analytic tools that go beyond text indexing. Some of the relevant technologies are: named entity recognition (e.g. identification of names of persons and locations within texts); extraction of semantic relations between entities (e.g. motion relations between persons and locations); temporal processing (i.e. identification of temporal expressions and events, and extraction of the relations between them); geographical information processing; key-concept extraction; distributional semantic analysis (i.e. quantification and categorization of semantic similarities between linguistic elements); and sentiment analysis (i.e. determining the attitude of a writer towards some topic, or identifying the general polarity of a text or statement: positive, negative or neutral).
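As a rough illustration of what distributional semantic analysis involves, the sketch below represents each word by the words that occur near it and compares two words by the cosine of their context vectors. It is a toy example under stated assumptions, not a description of any particular system: the three-sentence corpus, the window size of 2, and the function names `cooccurrence_vectors` and `cosine` are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()  # naive whitespace tokenization (assumption)
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy corpus, used only to make the example runnable.
corpus = [
    "the king rules the city",
    "the queen rules the city",
    "the merchant sails the sea",
]

vectors = cooccurrence_vectors(corpus)
sim_royal = cosine(vectors["king"], vectors["queen"])  # identical contexts
sim_mixed = cosine(vectors["king"], vectors["sea"])    # partly shared contexts

print(f"king ~ queen: {sim_royal:.3f}")
print(f"king ~ sea:   {sim_mixed:.3f}")
```

In this toy corpus "king" and "queen" appear in the same contexts and so receive a higher similarity than "king" and "sea". Real applications would rely on much larger corpora, proper tokenization and weighting schemes, which this sketch deliberately leaves out.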

In recent years, temporal and spatial analysis has taken root in several closely connected areas of the humanities. The extraction of temporal information from text, e.g. from news, has become a major research topic in Human Language Technologies and has been widely used in related applications such as question answering and automatic summarization systems. The study and exploitation of spatial information in history and literature have likewise received a great deal of attention, especially thanks to the possibilities offered by geographical information systems (GIS).

Crowd-sourcing techniques have recently been used to gather vast amounts of annotations from non-experts working online through specialized platforms such as Amazon Mechanical Turk (AMT) or CrowdFlower. In addition, other kinds of collaborative platforms, for transcribing manuscripts, tagging digital content or correcting OCR output, have been successfully tested in specific domains, fostering online collaboration among experts in the Humanities.

The complexity of texts is a multi-faceted phenomenon that affects the readability of documents but also contributes to characterizing the style of a writer or speaker. A 'subjective' component is also involved in language complexity, since a document that is understandable to some readers may be obscure to others. In this respect, the complexity of texts is intertwined with the problem of language use and evolution, and can also be investigated from a diachronic perspective.

Visualization tools and techniques are crucial to the analysis of digital humanities data, especially when large amounts of data are involved. Current visualization techniques can communicate ideas and analysis results more effectively than verbal description alone. The exploration and implementation of novel visualization techniques for displaying processed material in graphical form is therefore an important research topic, helping to convey a message to different types of audience.