Three papers from our group have been accepted at the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025), which will take place from November 5th to 9th, 2025. See you in Suzhou!
C. Casula, S. Vecellio Salto, E. Leonardelli and S. Tonelli: Job Unfair: An Investigation of Gender and Occupational Bias in Free-Form Text Completion by LLMs
Abstract: Disentangling how gender and occupations are encoded by LLMs is crucial to identify possible biases and prevent harms, especially given the widespread use of LLMs in sensitive domains such as human resources. In this work, we carry out an in-depth investigation of gender and occupational biases in English and Italian as expressed by 9 different LLMs (both base and instruction-tuned). Specifically, we focus on the analysis of sentence completions when LLMs are prompted with job-related sentences including different gender representations. We carry out a manual analysis of 4,500 generated texts along 4 dimensions that can reflect bias, propose a novel embedding-based method to investigate biases in generated texts, and finally carry out a lexical analysis of the model completions. In our qualitative and quantitative evaluation, we show that many facets of social bias remain unaccounted for even in aligned models, and that LLMs in general still reflect existing gender biases in both languages. Finally, we find that models still struggle with gender-neutral expressions, especially beyond English.
A. Ramponi, M. Rovera, R. Moro and S. Tonelli: Multilingual vs Crosslingual Retrieval of Fact-Checked Claims: A Tale of Two Approaches
Abstract: Retrieval of previously fact-checked claims is a well-established task whose automation can assist professional fact-checkers in the initial steps of information verification. Previous works have mostly tackled the task monolingually, i.e., with both the input and the retrieved claims in the same language. However, especially for languages with limited availability of fact-checks and in the case of global narratives, such as pandemics, wars, or international politics, it is crucial to be able to retrieve claims across languages. In this work, we examine strategies to improve multilingual and crosslingual performance, namely the selection of negative samples (in the supervised setting) and re-ranking (in the unsupervised setting). We evaluate all approaches on a dataset containing posts and claims in 47 languages (283 language combinations). We observe that the best results are obtained by using LLM-based re-ranking, followed by fine-tuning with negative examples sampled using a sentence similarity-based strategy. Most importantly, we show that crosslinguality is a setup with its own unique characteristics compared to the multilingual setup.
B. Savoldi, A. Ramponi, M. Negri and L. Bentivogli: Translation in the Hands of Many: Centering Lay Users in Machine Translation Interactions
Abstract: TBD