The paper “Creating a WhatsApp Dataset to Study Pre-teen Cyberbullying” by Stefano Menini, Filippo Oncini, Enrico Maria Piras, Rachele Sprugnoli and Sara Tonelli has been accepted at the 2nd Workshop on Abusive Language Online (ALW2). The workshop will be co-located with EMNLP 2018 in Brussels, Belgium.
Abstract: Although WhatsApp is used by teenagers as one major channel of cyberbullying, such interactions remain invisible due to the app's privacy policies, which do not allow ex-post data collection. Indeed, most of the information on these phenomena relies on surveys that collect self-reported data. In order to overcome this limitation, we describe in this paper the activities that led to the creation of a WhatsApp dataset to study cyberbullying among Italian students aged 12-13. We present not only the collected chats with annotations about user role and type of offense, but also the living lab created in a collaboration between researchers and schools to monitor and analyse cyberbullying. Finally, we discuss some open issues, dealing with ethical, operational and epistemic aspects.