Trigger warning: The data in this competition can contain graphic descriptions or extensive discussion of abuse, especially sexual abuse or torture.
The data was collected from Twitter using a Python library (twint) by Ambassador Lawrence Moruye for the AFD Gender-Based Violence Dataset Collection Challenge.
The objective of this challenge is to create a machine learning algorithm that classifies tweets about gender-based violence (GBV) into one of five categories: sexual violence, emotional violence, harmful traditional practices, physical violence and economic violence.
Files available for download:
- Train.csv - contains the target. This is the dataset that you will use to train your model.
- Test.csv - resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
- SampleSubmission.csv - shows the submission format for this competition, with the ‘Tweet_ID’ column mirroring that of Test.csv and the ‘type’ column containing your predictions. The order of the rows does not matter, but the ‘Tweet_ID’ values must be correct.
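
The submission format described above can be sketched in Python as follows. This is a minimal illustration using only the standard library, not an official starter script. The column names ‘Tweet_ID’ and ‘type’ come from the description above; the helper names `write_submission` and `check_submission`, the example IDs, and the label strings are hypothetical, so check Train.csv and SampleSubmission.csv for the exact values before submitting.

```python
import csv
import io


def write_submission(tweet_ids, predictions, out_file):
    """Write one row per test tweet: its Tweet_ID and the predicted category."""
    if len(tweet_ids) != len(predictions):
        raise ValueError("need exactly one prediction per Tweet_ID")
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(["Tweet_ID", "type"])  # header as described for SampleSubmission.csv
    writer.writerows(zip(tweet_ids, predictions))


def check_submission(test_ids, submission_file):
    """Return True if the submission holds exactly the test Tweet_IDs, in any order."""
    reader = csv.DictReader(submission_file)
    submitted = [row["Tweet_ID"] for row in reader]
    return sorted(submitted) == sorted(test_ids)


# Hypothetical IDs and label strings, for illustration only.
ids = ["ID_A", "ID_B", "ID_C"]
preds = ["sexual_violence", "physical_violence", "economic_violence"]

buffer = io.StringIO()  # stands in for an open file such as a submission CSV
write_submission(ids, preds, buffer)
buffer.seek(0)

# Row order differs from the test set here, which the description says is allowed.
print(check_submission(["ID_C", "ID_A", "ID_B"], buffer))
```

To write a real file, pass a handle opened with `open(path, "w", newline="")` instead of the in-memory buffer.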