
COVID-19 Tweet Classification Challenge by #ZindiWeekendz

Helping Africa
$300 USD
Completed (almost 6 years ago)
Natural Language Processing
Classification
198 joined
140 active
Start: May 08, 20
Close: May 10, 20
Reveal: May 10, 20
About

The objective of this challenge is to develop a machine learning model that assesses whether a Twitter post is about covid-19 or not. The data used for this challenge was collected by the Zindi team via the Twitter API from tweets over the past year. There are ~7,000 tweets in the train set and ~3,000 in the test set.

Tweets have been classified as covid-19-related (1) or not covid-19-related (0). All tweets have had the following keywords removed:

  • corona
  • coronavirus
  • covid
  • covid19
  • covid-19
  • sarscov2
  • 19

The tweets have also had usernames and web addresses removed to ensure anonymity.

Leave your predictions as probabilities with values between 0 and 1 and do not round them to 0s or 1s.
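
As an illustration of what an unrounded probability output can look like, here is a minimal sketch of a bag-of-words Naive Bayes classifier written with only the Python standard library. It is not the organisers' method or a recommended solution. The function names (`tokenize`, `train_naive_bayes`, `predict_proba`) and the toy tweets are made up for this example; real training data would come from Train.csv.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train_naive_bayes(texts, labels):
    """Count words per class (0 or 1) and documents per class."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for text, label in zip(texts, labels):
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict_proba(model, text):
    """Return P(label = 1 | text) as an unrounded float between 0 and 1."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        total_words = sum(word_counts[label].values())
        # Start from the log prior (assumes both classes appear in training).
        score = math.log(class_counts[label] / total_docs)
        for token in tokenize(text):
            if token not in vocab:
                continue  # ignore words never seen during training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        log_scores[label] = score
    # Normalise the two log scores into a probability for class 1.
    diff = max(min(log_scores[0] - log_scores[1], 700.0), -700.0)
    return 1.0 / (1.0 + math.exp(diff))


# Toy data invented for this sketch (not taken from the competition files).
toy_texts = [
    "stay home wash your hands lockdown extended",
    "new quarantine rules and lockdown announced",
    "great football match last night",
    "loving this new music album",
]
toy_labels = [1, 1, 0, 0]

model = train_naive_bayes(toy_texts, toy_labels)
p_pos = predict_proba(model, "lockdown and quarantine again")
p_neg = predict_proba(model, "football and music tonight")
print(p_pos, p_neg)
```

The values returned by `predict_proba` are kept as floats and never thresholded to 0 or 1, which is the form the instruction above asks for.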

Files available for download are:

  • Train.csv - Labelled tweets: 1 indicates that the tweet is about covid-19 and 0 indicates that it is not. You will use this to train your model.
  • Test.csv - Tweets that you must classify using your model.
  • SampleSubmission.csv - An example of what your submission file should look like. The order of the rows does not matter, but the IDs must be correct. Values in the target column should range from 0 to 1.
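
The submission requirements in the list above can be sketched as a small helper that builds the CSV text and rejects values outside the 0 to 1 range. The column headers `ID` and `target` are assumptions based on the wording of the description, and the IDs `test_1` and `test_2` are hypothetical; the real headers and IDs should be copied from SampleSubmission.csv and Test.csv.

```python
import csv
import io


def build_submission(ids, probabilities, id_col="ID", target_col="target"):
    """Return CSV text with one row per ID and an unrounded probability.

    The default column names are assumptions; check SampleSubmission.csv.
    """
    if len(ids) != len(probabilities):
        raise ValueError("ids and probabilities must have the same length")
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([id_col, target_col])
    for row_id, prob in zip(ids, probabilities):
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability out of range for {row_id}: {prob}")
        # repr() keeps the full float value instead of rounding it.
        writer.writerow([row_id, repr(float(prob))])
    return buffer.getvalue()


# Hypothetical IDs and predictions, for illustration only.
csv_text = build_submission(["test_1", "test_2"], [0.9, 0.275])
print(csv_text)
```

To save the result, the returned text can be written to a file such as `submission.csv` (a file name chosen here for illustration).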