- Phone: 0461314542
- FBK Povo
Since May 2013 I have been the head of the Digital Humanities research group at FBK. I am involved in the SIMPATICO H2020 project, where I coordinate the activities related to the automated simplification of Italian documents. In addition, I am involved in the CREEP EIT Digital project and the HATEMETER project, both related to automated hate speech monitoring and classification. I am also part of several other digital humanities projects, which can be found here.
I received my Master's degree in German Philology and Linguistics from the University of Bergamo in 2000.
From 2000 to 2002 I attended a postgraduate course (Aufbaustudium) in Computational Linguistics at the Centrum fuer Informations- und Sprachverarbeitung, Ludwig-Maximilians-Universitaet, Munich, Germany. I also worked at Sails Labs Munich, an IT company developing machine translation systems.
In 2010 I defended my PhD thesis in "Language Sciences" at Università Ca' Foscari, Venice (advisors: Emanuele Pianta and Prof. Rodolfo Delmonte).
From 2010 to 2013 I held a post-doc position at Fondazione Bruno Kessler, where I was involved in the Pescado, Terence and NewsReader European projects. I was also a teaching assistant in Computational Linguistics at the University of Bolzano, Faculty of Computer Science.
I am currently an adjunct professor of Language Interfaces (jointly with Daniele Falavigna) at the Dept. of Psychology and Cognitive Science, University of Trento. In the past, I was an adjunct professor of "Computational Linguistics" at the University of Trento, Department of Psychology and Cognitive Science, and of "Language Resources and Ontologies" (joint course with Marco Rospocher) at the University of Trento, Master in Philosophy and Modernity Languages.
Current PhD students:
2017 - present: Matteo Lorenzini (co-advised with Marco Rospocher), "Automatic quality improvement and content enrichment of digital cultural heritage data", ICT Doctoral School, University of Trento
2016 - present: Lorenzo Lucchini (co-advised with Bruno Lepri), "Modeling and forecasting cultural dynamics with natural language processed data", ICT Doctoral School, University of Trento
Past PhD students:
2013 - 2018: Stefano Menini, "Automatic Analysis of Agreement and Disagreement in the Political Domain", ICT Doctoral School, University of Trento
2013 - 2018: Rachele Sprugnoli, "Event detection and classification for the Digital Humanities", ICT Doctoral School, University of Trento
Journal articles: [link]
Event and Temporal/Causal Processing
Rachele Sprugnoli and Sara Tonelli. One, no one and one hundred thousand events: Defining and processing events in an inter-disciplinary perspective. Natural Language Engineering 23(4): 485-506 (2017)
Paramita Mirza and Sara Tonelli. On the contribution of word embeddings to temporal relation classification. In Proceedings of the 26th International Conference on Computational Linguistics (Coling 2016), Osaka, Japan.
Paramita Mirza and Sara Tonelli. CATENA: CAusal and Temporal relation Extraction from NAtural language texts. In Proceedings of the 26th International Conference on Computational Linguistics (Coling 2016), Osaka, Japan. [link to software]
Paramita Mirza and Sara Tonelli. Classifying Temporal Relations with Simple Features. In Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics (EACL2014), Gothenburg, Sweden, 2014. [poster] [bib]
Paramita Mirza and Sara Tonelli. An Analysis of Causality between Events and its Relation to Temporal Information. In Proceedings of the 25th International Conference on Computational Linguistics (Coling2014), Dublin, Ireland. [ppt] [bib] [download CausalTimeBank]
Rosella Gennari, Sara Tonelli and Pierpaolo Vittorini. Challenges in Quality of Temporal Data - Starting with Gold Standards. Journal of Data and Information Quality 6(2-3): 9:1-9:4 (2015)
Agreement, Disagreement and Argumentation Mining
Stefano Menini, Elena Cabrio, Sara Tonelli and Serena Villata. Never Retreat, Never Retract: Argumentation Analysis for Political Speeches. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI2018), New Orleans, US. [link to data]
Stefano Menini, Federico Nanni, Simone Ponzetto and Sara Tonelli. Topic-based Agreement and Disagreement in US Electoral Manifestos. In Proceedings of Empirical Methods in Natural Language Processing (EMNLP 2017), Copenhagen, Denmark. (honorable mention)
Stefano Menini and Sara Tonelli. Agreement and Disagreement: Comparison of points of view in the political domain. In Proceedings of the 26th International Conference on Computational Linguistics (Coling 2016), Osaka, Japan. [link to datasets]
Digital Humanities, Historical Data Processing, Digital Cultural Heritage
Giovanni Moretti, Rachele Sprugnoli, Stefano Menini, Sara Tonelli. ALCIDE: Extracting and visualising content from large document collections to support humanities studies. Knowledge-Based Systems 111: 100-112 (2016). [link to demo]
Stefano Menini, Rachele Sprugnoli, Giovanni Moretti, Enrico Bignotti, Sara Tonelli, Bruno Lepri. RAMBLE ON: Tracing Movements of Popular Historical Figures. EACL (Software demonstrations) 2017: 77-80 [link to system] [link to system version working with data on Italian Shoah]
Alessio Palmero Aprosio and Sara Tonelli. Recognizing Biographical Sections in Wikipedia (short paper). In Proceedings of the International Conference on Empirical Methods in Natural Language Processing (EMNLP2015), Lisbon, Portugal, 2015. [link to software]
Rachele Sprugnoli, Sara Tonelli, Alessandro Marchetti, Giovanni Moretti. Towards sentiment analysis for historical texts. Digital Scholarship in the Humanities 31(4): 762-772 (2016).
Rachele Sprugnoli, Tommaso Caselli, Sara Tonelli, Giovanni Moretti. The Content Types Dataset: a New Resource to Explore Semantic and Functional Characteristics of Texts. Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2017): 260-266 [link to data]
Alessio Palmero Aprosio, Sara Tonelli, Stefano Menini, Giovanni Moretti. Using Semantic Linking to Understand Persons' Networks Extracted from Text. Frontiers in Digital Humanities, vol. 4 (2017) [link to data]
Mauro Dragoni, Sara Tonelli, Giovanni Moretti. A Knowledge Management Architecture for Digital Cultural Heritage. Journal on Computing and Cultural Heritage, 10(3): 15:1-15:18 (2017).
Mauro Dragoni, Serena Villata, Sara Tonelli and Elena Cabrio. Enriching a Small Artwork Collection through Semantic Linking. In Proceedings of the 13th Extended Semantic Web Conference (ESWC2016), Crete, Greece.
Carolina Scarton, Alessio Palmero Aprosio, Sara Tonelli, Tamara Martin-Wanton, Lucia Specia. MUSST: A Multilingual Syntactic Simplification Tool. IJCNLP (System Demonstrations) 2017: 25-28
Gianni Barlacchi, Sara Tonelli. ERNESTA: A Sentence Simplification Tool for Children's Stories in Italian. CICLing (2) 2013: 476-487
Sara Tonelli, Alessio Palmero Aprosio, Francesca Saltori. SIMPITIKI: a Simplification corpus for Italian. Proceedings of the Third Italian Conference on Computational Linguistics, 2016. [link to dataset]
Sara Tonelli, Alessio Palmero Aprosio, Marco Mazzon. The Impact of Phrases on Italian Lexical Simplification. Proceedings of the Fourth Italian Conference on Computational Linguistics, 2017.
FrameNet, Keyword Extraction and Terminology
Marco Fossati, Claudio Giuliano and Sara Tonelli. Outsourcing FrameNet to the Crowd (short paper). In Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL), Sofia, Bulgaria, 2013.
Sara Tonelli, Claudio Giuliano, Kateryna Tymoshenko. Wikipedia-based WSD for multilingual frame annotation. Artificial Intelligence. 194: 203-221 (2013).
Sara Tonelli, Claudio Giuliano. Wikipedia as frame information repository. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2009), Singapore, 2009.
Sara Tonelli, Daniele Pighin. New features for FrameNet - WordNet mapping. In Proceedings of the Thirteenth Conference on Computational Language Learning (CoNLL), Boulder, Colorado, 2009.
Sara Tonelli. Semi-automatic Techniques for Extending the FrameNet Lexical Database to New Languages. PhD thesis, Dept. of Language Sciences, Università Ca’ Foscari, Venezia, 2010. [link to Italian FrameNet data]
Volha Bryl, Sara Tonelli, Claudio Giuliano, Luciano Serafini. A novel Framenet-based resource for the semantic web. In Proceedings of the 27th Annual Symposium on Applied Computing, pp. 360-365. [link to data]
Marco Rospocher, Sara Tonelli, Luciano Serafini, Emanuele Pianta. Corpus-based terminological evaluation of ontologies. Applied Ontology 7(4): 429-448 (2012).
Giovanni Moretti, Rachele Sprugnoli, Sara Tonelli. Digging in the Dirt: Extracting Keyphrases from Texts with KD. In Proceedings of the Second Italian Conference on Computational Linguistics (CLiC-it 2015), Trento, Italy [link to demo and software].
Mihael Arcan, Marco Turchi, Sara Tonelli and Paul Buitelaar. Leveraging bilingual terminology to improve machine translation in a CAT environment. Natural Language Engineering 23(5): 763-788 (2017)
Discourse Analysis and Parsing
Sucheta Ghosh, Richard Johansson, Giuseppe Riccardi and Sara Tonelli. Shallow Discourse Parsing with Conditional Random Fields. In Proceedings of the 5th International Joint Conference on Natural Language Processing (IJCNLP-2011), Chiang Mai, Thailand, 2011.
Sara Tonelli and Elena Cabrio. Hunting for Entailing Pairs in the Penn Discourse Treebank. In Proceedings of the 24th International Conference on Computational Linguistics (Coling2012), Mumbai, India, 2012.
Sara Tonelli, Giuseppe Riccardi, Rashmi Prasad and Aravind K. Joshi. Annotation of Discourse Relations for Conversational Spoken Dialogs. In Proceedings of LREC 2010.
Sucheta Ghosh, Sara Tonelli, Giuseppe Riccardi, Richard Johansson. End-to-End Discourse Parser Evaluation. In Proceedings of ICSC 2011: 169-172