I have been the head of the Digital Humanities research group at FBK since 2013. I am currently involved in the H2020 ODEUROPA project, where I lead the work package on olfactory information extraction, and in the H2020 PERCEPTIONS project, on the online perception of the EU and the narratives around migration to the EU. Since January 2021 I have also been the scientific coordinator of the KID ACTIONS European project, which aims to address cyberbullying among children and adolescents through interactive education and gamification. I am also part of several other digital humanities projects, which can be found here.
I have a PhD in Language Sciences from Università Ca' Foscari, Venice. In 2020 I obtained the national habilitation as Associate Professor (seconda fascia) for the area 'Information Systems' 09/H1.
In the past I was involved in several European projects: Pescado (FP7 - keyword extraction), Terence (FP7 - event-based text simplification), NewsReader (FP7 - event extraction and semantic role recognition), SIMPATICO (H2020 - text simplification in the administrative domain), HATEMETER (REC - social media monitoring for Islamophobia detection), CREEP (EitDIGITAL - cyberbullying detection).
I was senior area chair for Language Resources and Evaluation at ACL 2018 and 2019. I have served as area chair for Language Resources and Evaluation at ACL 2020 and for the Computational Social Science and Social Media track at EMNLP 2020 and NAACL 2021. I was a senior program committee member at ECAI 2020. In 2020 I was a keynote speaker at the Slovenian Digital Technologies and Digital Humanities Annual Conference.
Current PhD students:
2020 - present: Camilla Casula, "Multilingual abusive language detection", ICT Doctoral School, University of Trento
2019 - present: Daniela Trotta (co-advised with Annibale Elia), "Multimodal political communication", Dept. of Linguistics, University of Salerno
2018 - present: Federico Bonetti, "Gamification for Linguistic Annotation", Doctoral School in Cognitive Science, University of Trento
2017 - present: Matteo Lorenzini (co-advised with Marco Rospocher), "Automatic quality improvement and content enrichment of digital cultural heritage data", ICT Doctoral School, University of Trento
Past PhD students:
2016 - 2020: Lorenzo Lucchini (co-advised with Bruno Lepri), "Modeling and forecasting cultural dynamics with natural language processed data", ICT Doctoral School, University of Trento
2013 - 2018: Stefano Menini (now post-doc in the DH group), "Automatic Analysis of Agreement and Disagreement in the Political Domain", ICT Doctoral School, University of Trento
2013 - 2018: Rachele Sprugnoli (now at Università Cattolica del Sacro Cuore, Milan), "Event detection and classification for the Digital Humanities", ICT Doctoral School, University of Trento
2012 - 2016: Paramita Mirza (now at Max Planck Institute for Informatics, Germany), "Extracting temporal and causal relations between events", ICT Doctoral School, University of Trento [arXiv]
Social media analysis and Hate speech detection: CREEP and Hatemeter projects, KID ACTIONS project
Event and temporal/causal processing: Terence and NewsReader European projects, Rachele Sprugnoli's thesis, Paramita Mirza's thesis [link to software, poster, download CausalTimeBank]
Agreement, Disagreement and Argumentation Mining, especially in the political domain: collaboration with INRIA & University of Nice, Stefano Menini's thesis [link to data, link to datasets], collaboration with University of Mannheim (honorable mention at EMNLP 2017 for joint paper)
Digital Humanities, Historical Data Processing, Digital Cultural Heritage: ALCIDE project [link to demo], RAMBLE-ON application [link to generic system and to system version working with data on Italian Shoah], collaboration with Tommaso Caselli from University of Groningen to create the Content Type corpus [link to data], Matteo Lorenzini's thesis work. Current ODEUROPA project.
Text Simplification: Terence and SIMPATICO European projects, MUSST syntactic simplifier [link to software], ERNESTA simplification tool, SIMPITIKI corpus for Italian Simplification [link to dataset]
FrameNet, Keyword Extraction and Terminology: my PhD thesis work [link to Italian FrameNet data], collaboration with Luciano Serafini to create a FrameNet-based resource for the Semantic Web [link to data], Keyphrase Digger extraction tool [link to demo and software].
Corazza, Michele; Menini, Stefano; Cabrio, Elena; Tonelli, Sara; Villata, Serena, A Multilingual Evaluation for Online Hate Speech Detection, in «ACM TRANSACTIONS ON INTERNET TECHNOLOGY», vol. 20, n. 2, 2020, pp. 1-22
Sprugnoli, Rachele; Tonelli, Sara, Novel Event Detection and Classification for Historical Texts, in «COMPUTATIONAL LINGUISTICS», vol. 45, n. 2, 2019, pp. 229-265
Lucchini, Lorenzo; Tonelli, Sara; Lepri, Bruno, Following the footsteps of giants: modeling the mobility of historically notable individuals using Wikipedia, in «EPJ DATA SCIENCE», vol. 8, n. 36, 2019
Menini, S.; Cabrio, E.; Tonelli, S.; Villata, S., Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18), 2018
Menini, Stefano; Nanni, Federico; Ponzetto, Simone Paolo; Tonelli, Sara, Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, 2017, pp. 2928-2934