Sara Tonelli

Head of unit

    Short bio

    I have been the head of the Digital Humanities research group at FBK since 2013. I am currently involved in the H2020 ODEUROPA project, where I lead the work package on olfactory information extraction, and in the H2020 PERCEPTIONS project, on the online perception of the EU and the narratives around migration to the EU. Since January 2021 I have also been the scientific coordinator of the KID ACTIONS European project, which aims to address cyberbullying among children and adolescents through interactive education and gamification. I am also part of several other digital humanities projects, which can be found here.

    I have a PhD in Language Sciences from Università Ca' Foscari, Venice. In 2020 I obtained the national habilitation as Associate Professor (seconda fascia) for the area "Information Systems" (09/H1). I am also a member of ELLIS, the European Laboratory for Learning and Intelligent Systems, and an appointed Honorary Fellow (cultore della materia) in Computational Linguistics (L-LIN/01) at Università di Pavia, Italy.

    In the past I was involved in several European projects: Pescado (FP7 - keyword extraction), Terence (FP7 - event-based text simplification), NewsReader (FP7 - event extraction and semantic role recognition), SIMPATICO (H2020 - text simplification in the administrative domain), HATEMETER (REC - social media monitoring for islamophobia detection), CREEP (EitDIGITAL - cyberbullying detection).

    I have served as area chair for Language Resources and Evaluation at ACL 2020 and for the Computational Social Science and Social Media track at EMNLP 2020 and NAACL 2021. I was a Senior Program Committee member at ECAI 2020. In 2021 I was a keynote speaker at the 2nd Conference on Computational Humanities Research, at the 5th SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, and at the 3rd Conference on Language Data and Knowledge. In 2022 I am area co-chair for "Digital Humanities and Cultural Heritage" at LREC and area chair for the "Offensive and Non Inclusive Language Detection and Analysis" track at COLING.


    Student supervision:

    Current PhD students:

    2021 - present: Teresa Paccosi, "Extraction of Olfactory Information from Texts", Doctoral School in Cognitive Science, University of Trento

    2020 - present: Camilla Casula, "Multilingual abusive language detection", ICT Doctoral School, University of Trento

    2019 - present: Daniela Trotta (co-advised with Annibale Elia), "Multimodal political communication", Dept. of Linguistics, University of Salerno


    Past PhD students:

    2018 - 2022: Federico Bonetti, "Gamification for Linguistic Annotation", Doctoral School in Cognitive Science, University of Trento

    2017 - 2022: Matteo Lorenzini (co-advised with Marco Rospocher), "Automatic quality improvement and content enrichment of digital cultural heritage data", ICT Doctoral School, University of Trento

    2016 - 2020: Lorenzo Lucchini (co-advised with Bruno Lepri), "Modeling and forecasting cultural dynamics with natural language processed data", ICT Doctoral School, University of Trento

    2013 - 2018: Stefano Menini (now post-doc in the DH group), "Automatic Analysis of Agreement and Disagreement in the Political Domain", ICT Doctoral School, University of Trento

    2013 - 2018: Rachele Sprugnoli (now at Università Cattolica del Sacro Cuore, Milan), "Event detection and classification for the Digital Humanities", ICT Doctoral School, University of Trento

    2012 - 2016: Paramita Mirza (now at Max Planck Institute for Informatics, Germany), "Extracting temporal and causal relations between events", ICT Doctoral School, University of Trento [arXiv]

    Research topics

    Social media analysis and hate speech detection: CREEP and Hatemeter projects, KID ACTIONS project, PROTECTOR project, PERCEPTIONS project.

    Event and temporal/causal processing: Terence and NewsReader European projects, Rachele Sprugnoli's thesis, Paramita Mirza's thesis [link to software, poster, download CausalTimeBank]

    Agreement, Disagreement and Argumentation Mining, especially in the political domain: collaboration with INRIA & University of Nice, Stefano Menini's thesis [link to data, link to datasets], collaboration with University of Mannheim (honorable mention at EMNLP 2017 for joint paper)

    Digital Humanities, Historical Data Processing, Digital Cultural Heritage: ALCIDE project [link to demo], project on Epistolario De Gasperi and Edizione Nazionale Aldo Moro, Matteo Lorenzini's thesis work. Current ODEUROPA project.

    Text Simplification: Terence and SIMPATICO European projects, MUSST syntactic simplifier [link to software], ERNESTA simplification tool, SIMPITIKI corpus for Italian Simplification [link to dataset]

    FrameNet, Keyword Extraction and Terminology: my PhD thesis work [link to Italian FrameNet data], collaboration with Luciano Serafini to create a FrameNet-based resource for the Semantic Web [link to data], Keyphrase Digger extraction tool [link to demo and software].


    Main publications

    1. Corazza, Michele; Menini, Stefano; Cabrio, Elena; Tonelli, Sara; Villata, Serena,
      A Multilingual Evaluation for Online Hate Speech Detection,
      vol. 20,
      n. 2,
      pp. 1-
    2. Sprugnoli, Rachele; Tonelli, Sara,
      Novel Event Detection and Classification for Historical Texts,
      vol. 45,
      n. 2,
      pp. 229-
    3. Lucchini, Lorenzo; Tonelli, Sara; Lepri, Bruno,
      in «EPJ DATA SCIENCE»,
      vol. 8,
      n. 36,
    4. Menini, S.; Cabrio, E.; Tonelli, S.; Villata, S.,
      Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI-18),
    5. Menini, Stefano; Nanni, Federico; Ponzetto, Simone Paolo; Tonelli, Sara,
      Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing,
      Association for Computational Linguistics,
      pp. 2928-
