I have been the head of the Digital Humanities research group at FBK since 2013. I am currently involved in the H2020 ODEUROPA project, where I lead the work package on olfactory information extraction, and in the H2020 PERCEPTIONS project, on the online perception of the EU and the narratives around migration to the EU. Since January 2021 I have also been the scientific coordinator of the KID ACTIONS European project, which aims to address cyberbullying among children and adolescents through interactive education and gamification. I am also part of several other projects, for example PROTECTOR and STAND BY ME; more details here.
I have a PhD in Language Sciences from Università Ca' Foscari, Venice. In 2020 I obtained the national habilitation as Associate Professor (seconda fascia) for the area 'Information Systems' (09/H1). I am also a member of ELLIS, the European Laboratory for Learning and Intelligent Systems, and an appointed Honorary Fellow (cultore della materia) in Computational Linguistics (L-LIN/01) at Università di Pavia, Italy. I currently serve as Liaison Representative of the ACL Special Interest Group on Language Technologies for the Socio-Economic Sciences and Humanities (SIG-HUM), and I am on the board of the Italian Association for Computational Linguistics (AILC).
In the past I was involved in several European projects: Pescado (FP7 - keyword extraction), Terence (FP7 - event-based text simplification), NewsReader (FP7 - event extraction and semantic role recognition), SIMPATICO (H2020 - text simplification in the administrative domain), HATEMETER (REC - social media monitoring for islamophobia detection), CREEP (EitDIGITAL - cyberbullying detection).
In 2022, I was area co-chair for "Digital Humanities and Cultural Heritage" at LREC and area chair for the "Offensive and Non Inclusive Language Detection and Analysis" track at COLING.
Current PhD students:
2021 - present: Teresa Paccosi, "Extraction of Olfactory Information from Texts", Doctoral School in Cognitive Science, University of Trento
2020 - present: Camilla Casula, "Multilingual abusive language detection", ICT Doctoral School, University of Trento
2019 - present: Daniela Trotta (co-advised with Annibale Elia), "Multimodal political communication", Dept. of Linguistics, University of Salerno
Past PhD students:
2018 - 2022: Federico Bonetti, "Gamification for Linguistic Annotation", Doctoral School in Cognitive Science, University of Trento
2017 - 2022: Matteo Lorenzini (co-advised with Marco Rospocher), "Automatic quality improvement and content enrichment of digital cultural heritage data", ICT Doctoral School, University of Trento
2016 - 2020: Lorenzo Lucchini (co-advised with Bruno Lepri), "Modeling and forecasting cultural dynamics with natural language processed data", ICT Doctoral School, University of Trento
2013 - 2018: Stefano Menini, "Automatic Analysis of Agreement and Disagreement in the Political Domain", ICT Doctoral School, University of Trento
2013 - 2018: Rachele Sprugnoli, "Event detection and classification for the Digital Humanities", ICT Doctoral School, University of Trento
2012 - 2016: Paramita Mirza, "Extracting temporal and causal relations between events", ICT Doctoral School, University of Trento [arXiv]
Social media analysis and hate speech detection: CREEP and Hatemeter projects, KID ACTIONS project, PROTECTOR project, PERCEPTIONS project.
Event and temporal/causal processing: Terence and NewsReader European projects, Rachele Sprugnoli's thesis, Paramita Mirza's thesis [link to software, poster, download CausalTimeBank]
Agreement, Disagreement and Argumentation Mining, especially in the political domain: collaboration with INRIA & University of Nice, Stefano Menini's thesis [link to data, link to datasets], collaboration with University of Mannheim (honorable mention at EMNLP 2017 for joint paper)
Digital Humanities, Historical Data Processing, Digital Cultural Heritage: ALCIDE project [link to demo], project on Epistolario De Gasperi and Edizione Nazionale Aldo Moro, Matteo Lorenzini's thesis work. Current ODEUROPA project.
Text Simplification: Terence and SIMPATICO European projects, MUSST syntactic simplifier [link to software], ERNESTA simplification tool, SIMPITIKI corpus for Italian Simplification [link to dataset]
FrameNet, Keyword Extraction and Terminology: my PhD thesis work [link to Italian FrameNet data], collaboration with Luciano Serafini to create a FrameNet-based resource for the Semantic Web [link to data], Keyphrase Digger extraction tool [link to demo and software].
Corazza, Michele; Menini, Stefano; Cabrio, Elena; Tonelli, Sara; Villata, Serena, A Multilingual Evaluation for Online Hate Speech Detection, in «ACM TRANSACTIONS ON INTERNET TECHNOLOGY», vol. 20, n. 2, 2020, pp. 1-22
Sprugnoli, Rachele; Tonelli, Sara, Novel Event Detection and Classification for Historical Texts, in «COMPUTATIONAL LINGUISTICS», vol. 45, n. 2, 2019, pp. 229-265
Lucchini, Lorenzo; Tonelli, Sara; Lepri, Bruno, Following the footsteps of giants: modeling the mobility of historically notable individuals using Wikipedia, in «EPJ DATA SCIENCE», vol. 8, n. 36, 2019
Menini, S.; Cabrio, E.; Tonelli, S.; Villata, S., Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), 2018
Menini, Stefano; Nanni, Federico; Ponzetto, Simone Paolo; Tonelli, Sara, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2017, pp. 2928-2934