This resource supports the automatic identification of English-Italian code-mixing in historical English travel writing about Italy. We release:

This resource is available on our GitHub page:

A paper describing this work is under review.

[1] King, Ben, and Steven P. Abney. "Labeling the Languages of Words in Mixed-Language Documents using Weakly Supervised Methods." In HLT-NAACL, pp. 1110-1119. 2013.

[2] Schulz, Sarah, and Mareike Keller. "Code-switching ubique est - Language Identification and Part-of-Speech Tagging for Historical Mixed Text." In Proc. of LaTeCH. 2016.
