The dataset covers 19 events with a total of 20,514 annotated tweets.
Detailed statistics and the dataset are available at: https://github.com/rsuwaileh/IDRISI/tree/main/LMR
The data for every event has been split into training, development, and test sets, which are provided in JSONL format in the GitHub repository. The full training, development, and test files can be downloaded from here:
https://github.com/rsuwaileh/IDRISI/blob/main/LMR/data/EN/gold-random-json/
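Since the files are in JSONL format (one JSON object per line), a minimal Python sketch for reading a downloaded file could look like the following. The function name `read_jsonl` is hypothetical and not part of the IDRISI repository, and no field names are assumed here because the record schema is not described in this section; inspect the keys of the first record to see what each file contains.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path: Path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file.

    Hypothetical helper for illustration; not provided by the IDRISI repository.
    """
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)
```

For example, `records = list(read_jsonl(Path("train.jsonl")))` would load a local copy of a training file (the file name here is an assumption), after which `records[0].keys()` shows the available fields.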
Resources
In the IDRISI GitHub repository:
- Dataset: IDRISI-RE in JSON format
- Baselines: BERT-LMR and CRF-LMR baselines are available.