Two short papers and one demo paper have been accepted at the Conference on Recent Advances in Natural Language Processing (RANLP 2017). All are authored by Amosse Edouard, Elena Cabrio, Sara Tonelli and Nhan Le-Thanh, as the outcome of a collaboration between our group, the Wimmics Research team at INRIA and the University of Nice Sophia Antipolis.

Title: “Graph-based Event Extraction from Twitter”

Abstract: Detecting which tweets describe a specific event and clustering them is one of the main challenging tasks related to Social Media currently addressed in the NLP community. Existing approaches have mainly focused on detecting spikes in clusters around specific keywords or Named Entities (NE). However, one of the main drawbacks of such approaches is the difficulty in understanding when the same keywords describe different events. In this paper, we propose a novel approach that exploits NE mentions in tweets and their entity context to create a temporal event graph. Then, using simple graph theory techniques and a PageRank-like algorithm, we process the event graphs to detect clusters of tweets describing the same events. Experiments on two gold standard datasets show that our approach achieves state-of-the-art results both in terms of evaluation performances and of quality of the detected events.

Title: “You’ll Never Tweet Alone: Building Sports Match Timelines from Microblog Posts”

Abstract: In this paper, we propose an approach to building a timeline with salient actions in a sport game based on the tweets posted by users. We combine information provided by external knowledge bases to enrich the content of the tweets and apply graph theory to model relations between actions (e.g. goal, penalties) and participants of a game (e.g. players, teams). We demonstrate the validity of our approach using tweets written during the EURO 2016 Championship and evaluate the output against live summaries produced by sport channels.

Title: “Building timelines of soccer matches from Twitter” (demo of the previous paper)