We have created a GitHub repository that contains:

  • annotation guidelines designed to detect and classify event mentions in texts;
  • a corpus of historical texts annotated with events (span + class) following these guidelines.

Due to space limitations, the following resources are in an external Google Drive folder (https://drive.google.com/open?id=1HVIZpCmei90tE2hMWIyH-b7_hhHUKnmb):

  • a set of word embeddings pre-trained on a subset of the COHA corpus (https://corpus.byu.edu/coha/) consisting of texts published between 1860 and 1939, for a total of more than 198 million tokens;
  • the best models for the automatic detection of events and for the joint classification of event extent and type, developed with the BiLSTM implementation by Nils Reimers and Iryna Gurevych (https://github.com/UKPLab/emnlp2017-bilstm-cnn-crf).
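
To inspect the embeddings outside the BiLSTM framework, a loader along the following lines may be a useful starting point. It is only a minimal sketch: it assumes the vectors are distributed as plain text, one word per line followed by space-separated values, optionally preceded by a "<vocabulary size> <dimensions>" header line. That file format is an assumption and is not stated in this description; the function name `load_text_embeddings` and the sample values are hypothetical.

```python
import io
from typing import Dict, List, TextIO


def load_text_embeddings(handle: TextIO) -> Dict[str, List[float]]:
    """Parse embeddings stored as `word v1 v2 ... vn`, one word per line.

    ASSUMPTION: plain-text, space-separated format. Check the actual files
    in the Google Drive folder before relying on this.
    """
    vectors: Dict[str, List[float]] = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        # Skip blank lines and an optional "<vocab_size> <dim>" header line.
        if len(parts) <= 2:
            continue
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


# Tiny in-memory stand-in for the real file (made-up words and values).
sample = io.StringIO("2 3\nwar 0.1 0.2 0.3\ntreaty 0.4 0.5 0.6\n")
embeddings = load_text_embeddings(sample)
print(sorted(embeddings))
```

With a real file, the same function could be called on an open file handle, for example `with open(path, encoding="utf-8") as handle:`, where `path` points to the downloaded embeddings.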

REPOSITORY: https://github.com/dhfbk/Histo

Contacts:

sprugnoli[AT]fbk.eu