Histo: event detection and classification for the Digital Humanities
We have created a GitHub repository that contains:
- annotation guidelines designed to detect and classify event mentions in texts;
- a corpus of historical texts annotated with events (span + class) following the previously mentioned guidelines.
Due to space limitations, the following resources are in an external Google Drive folder (https://drive.google.com/open?id=1HVIZpCmei90tE2hMWIyH-b7_hhHUKnmb):
- a set of word embeddings pre-trained on a subset of the COHA corpus (https://corpus.byu.edu/coha/) comprising texts published between 1860 and 1939, more than 198 million tokens in total;
- the best models for the automatic detection of events and for the joint classification of event extent and type, developed with the BiLSTM implementation by Nils Reimers and Iryna Gurevych (https://github.com/UKPLab/emnlp2017-bilstm-cnn-crf).
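
As a rough illustration of how the pre-trained embeddings might be read, here is a minimal sketch. It assumes the vectors are stored as plain text with one token per line followed by space-separated values, which is an assumption and not something stated above. The function name `load_text_embeddings` and the sample values are hypothetical.

```python
import io
from typing import Dict, List, TextIO


def load_text_embeddings(handle: TextIO) -> Dict[str, List[float]]:
    """Parse 'token v1 v2 ...' lines into a token -> vector dictionary.

    Assumption: plain-text format, values separated by single spaces.
    """
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 3:
            # Skip blank lines and an optional 'count dimension' header.
            continue
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


# Tiny in-memory sample standing in for the real file (values are made up).
sample = io.StringIO("2 3\nwar 0.1 0.2 0.3\ntreaty -0.4 0.5 0.6\n")
embeddings = load_text_embeddings(sample)
print(len(embeddings), embeddings["war"])
```

For the actual file, the same function could be given a handle from `open(path, encoding="utf-8")`, or from `gzip.open(path, "rt", encoding="utf-8")` if the file is compressed. Check the downloaded file's real format first.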