Two papers have been accepted for presentation at the Fourth Italian Conference on Computational Linguistics, to be held in Rome, December 11-13, 2017:
Title: “A little bit of bella pianura: Detecting Code-Mixing in Historical English Travel Writing” (PDF)
Authors: Rachele Sprugnoli, Sara Tonelli, Giovanni Moretti and Stefano Menini
Code-mixing is the alternation between two or more languages in the same text. This phenomenon is particularly relevant in the travel domain, since it can provide new insight into the way foreign cultures are perceived and described to readers. In this paper, we analyse English-Italian code-mixing in historical English travel writings about Italy. We retrain and compare two existing systems for automatic code-mixing detection, and analyse the semantic categories most often associated with Italian. We also release the domain corpus used in our experiments and the output of the extraction.
Title: “The impact of phrases on Italian lexical simplification” (PDF)
Authors: Sara Tonelli, Alessio Palmero Aprosio and Marco Mazzon
Automated lexical simplification has so far focused only on the replacement of single tokens with single tokens, and this choice has affected both the development of systems and the creation of benchmarks. In this paper, we argue that lexical simplification in real settings should deal with both single-token and multi-token terms, and we present a benchmark created for the task. We also describe how a freely available system can be tuned to cover the simplification of phrases, and we perform an evaluation comparing different experimental settings.