We will present two papers at the fifteenth biennial Language Resources and Evaluation Conference (LREC 2026), which will be held at the Palau de Congressos de Palma in Mallorca, Spain, on 11–16 May 2026.

Sebastiano Vecellio Salto, Camilla Casula, Alessio Palmero Aprosio and Sara Tonelli. “University Speaking for Everyone: Assessing Changes in Italian Higher Education Statutes Toward Gender-Inclusive Language”

Abstract:

We examine the editorial evolution of Italian university statutes toward inclusive language, analyzing how institutions represent female and non-binary identities and how these representations affect administrative communication. To this end, we compile and annotate a corpus of university statutes, tracing the changes that have led some universities to move from the use of the generic masculine to more inclusive formulations. We also experiment with tools for the automatic detection of non-inclusive language in institutional communication and methods for the automatic rewriting of texts into inclusive language.


Katarina Laken, Erik Bran Marino, Paloma Piot, Davide Bassi, Søren Fomsgaard, Michele Maggini, Renata Vieira, Marcos Garcia and Sara Tonelli. “MuteCods: A Multilingual Telegram Dataset with Benchmark Models for Conspiracy Theory Detection”

Abstract:

The proliferation of conspiracy theories and hateful messages on social media poses significant challenges for content moderation and public discourse. Despite their societal impact, existing datasets for automated conspiracy detection remain limited in scope and language coverage. We present a multilingual dataset of conspiracy content on Telegram comprising 5,750 messages across English, Dutch, Italian, Spanish and Portuguese from 87 channels documented as disseminating conspiracist and extremist content. Domain experts annotated messages for conspiracist tone, population replacement conspiracy theories (PRCT), vaccine conspiracies, and hate speech. We extensively report on the difficulties and caveats involved in creating and annotating this type of dataset. We establish classification baselines by evaluating six models in a zero-shot fashion and fine-tuning three encoder models, achieving F1 scores of up to 0.800 for conspiracist tone, 0.846 for PRCT, 0.843 for vaccine-related conspiracy theories, and 0.734 for hate speech. Inter-annotator agreement was moderate, consistent with the complexity documented in similar annotation tasks.