COVID-19 Tweet Classification Challenge

Helping Africa
Knowledge
Active
Natural Language Processing
Classification
704 joined
214 active
About

The objective of this challenge is to develop a machine learning model that assesses whether a Twitter post is about covid-19. The data for this challenge was collected by the Zindi team via the Twitter API from tweets posted over the past year. There are ~7,000 tweets in the train set and ~3,000 in the test set.
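As a rough illustration of the task, here is a minimal baseline sketch in plain Python. It is a small multinomial naive Bayes classifier over whitespace tokens, chosen only because it is short and outputs probabilities; it is not a method suggested by the organisers. The class name `TinyNaiveBayes` and the four toy tweets are hypothetical, and the labels assume 1 for covid-19-related and 0 otherwise.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokens are enough for a toy baseline.
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing for labels 0 and 1."""

    def fit(self, texts, labels):
        self.counts = {0: Counter(), 1: Counter()}
        self.n_docs = Counter(labels)
        for text, label in zip(texts, labels):
            self.counts[label].update(tokenize(text))
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        self.totals = {c: sum(self.counts[c].values()) for c in (0, 1)}
        return self

    def predict_proba(self, text):
        """Return the estimated probability that the text belongs to class 1."""
        n = sum(self.n_docs.values())
        log_score = {}
        for c in (0, 1):
            score = math.log(self.n_docs[c] / n)  # class prior
            for tok in tokenize(text):
                if tok in self.vocab:  # tokens never seen in training are skipped
                    num = self.counts[c][tok] + 1
                    den = self.totals[c] + len(self.vocab)
                    score += math.log(num / den)
            log_score[c] = score
        # Normalise the two log scores into a probability (stable softmax).
        m = max(log_score.values())
        e0 = math.exp(log_score[0] - m)
        e1 = math.exp(log_score[1] - m)
        return e1 / (e0 + e1)


# Hypothetical toy data, not taken from the competition files.
texts = [
    "stay home during the lockdown",
    "vaccine rollout starts next week",
    "great football match tonight",
    "new phone launch next week",
]
labels = [1, 1, 0, 0]

model = TinyNaiveBayes().fit(texts, labels)
p = model.predict_proba("lockdown vaccine")  # a value between 0.5 and 1
```

On the real data, the training texts and labels would come from the train set, and a stronger model would likely be needed; the sketch only shows the shape of a text-in, probability-out classifier.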

Tweets have been classified as covid-19-related (1) or not covid-19-related (0). All tweets have had the following keywords removed:

  • corona
  • coronavirus
  • covid
  • covid19
  • covid-19
  • sarscov2
  • 19

The tweets have also had usernames and web addresses removed to ensure anonymity.
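To get a feel for what the cleaned tweets may look like, the sketch below applies a similar scrub to raw text. The exact procedure used by the organisers is not documented here, so the case-insensitive substring removal, the regular expressions, and the function name `scrub` are all assumptions.

```python
import re

# Keywords as listed above.
KEYWORDS = ["corona", "coronavirus", "covid", "covid19", "covid-19", "sarscov2", "19"]

# Try longer keywords first so "coronavirus" is removed whole rather than as "corona" + "virus".
_keyword_re = re.compile(
    "|".join(re.escape(k) for k in sorted(KEYWORDS, key=len, reverse=True)),
    re.IGNORECASE,
)
_user_re = re.compile(r"@\w+")                 # assumed username pattern
_url_re = re.compile(r"(?:https?://|www\.)\S+")  # assumed web-address pattern


def scrub(text):
    """Remove web addresses, usernames, and the listed keywords, then tidy whitespace."""
    text = _url_re.sub("", text)
    text = _user_re.sub("", text)
    text = _keyword_re.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


cleaned = scrub("@who says COVID-19 cases rise https://t.co/abc")
# cleaned == "says cases rise"
```

Under this substring assumption, removing "19" also alters unrelated numbers (for example "2019" becomes "20"), which may explain odd-looking tokens in the data.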

Leave your predictions as probabilities with values between 0 and 1 and do not round them to 0s or 1s.
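A quick sanity check before submitting can catch predictions that fall outside the allowed range or that were rounded by accident. The helper below is a hypothetical sketch, not part of any competition tooling.

```python
def check_probabilities(preds):
    """Return a list of problems found in a sequence of predicted probabilities."""
    problems = []
    # NaN fails the range comparison, so it is reported as out of range too.
    if any(not (0.0 <= p <= 1.0) for p in preds):
        problems.append("values outside [0, 1]")
    # If every value is exactly 0 or 1, the predictions were probably rounded.
    if preds and all(p in (0.0, 1.0) for p in preds):
        problems.append("all values are exactly 0 or 1 (looks rounded)")
    return problems


ok = check_probabilities([0.12, 0.87, 0.5])      # []
rounded = check_probabilities([0, 1, 1])         # flags the rounding
```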

Files
Train contains the target. This is the dataset that you will use to train your model.
This shows the submission format for this competition, with the ‘ID’ column mirroring that of Test.csv and the ‘target’ column containing your predictions. The order of the rows does not matter, but the ID values must be correct.
Test resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
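The submission layout described above can be sketched with the standard `csv` module. The column names ID and target come from the description; the function name `write_submission`, the example IDs, and the six-decimal formatting are assumptions made for illustration.

```python
import csv
import io


def write_submission(ids, probs, fileobj):
    """Write one row per test ID with its predicted probability."""
    if len(ids) != len(probs):
        raise ValueError("ids and probs must have the same length")
    writer = csv.writer(fileobj)
    writer.writerow(["ID", "target"])
    for id_, p in zip(ids, probs):
        # Keep the probability as a decimal; do not round it to 0 or 1.
        writer.writerow([id_, f"{p:.6f}"])


# Hypothetical IDs; real ones should be copied from Test.csv.
buf = io.StringIO()
write_submission(["test_1", "test_2"], [0.12, 0.87], buf)
rows = buf.getvalue().splitlines()
# rows == ["ID,target", "test_1,0.120000", "test_2,0.870000"]
```

To write to disk instead, the same function could be called with a file opened as `open(path, "w", newline="")`, where `path` is whatever file name you choose.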