To Vaccinate or Not to Vaccinate: It’s not a Question 🩺

Skills you will learn: Natural Language Processing, Classification, Sentiment Analysis

About

The data comes from tweets collected and classified through Crowdbreaks.org [Muller, Martin M., and Marcel Salathe. "Crowdbreaks: Tracking Health Trends Using Public Social Media Data and Crowdsourcing." Frontiers in Public Health 7 (2019)]. Each tweet has been classified as pro-vaccine (1), neutral (0), or anti-vaccine (-1). Usernames and web addresses have been removed from the tweets.

The objective of this challenge is to develop a machine learning model that assesses whether a Twitter post related to vaccinations is positive, neutral, or negative.
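To make the three-class task concrete, here is a minimal sketch of a bag-of-words Naive Bayes classifier written with the standard library only. It is not the method used in the starter notebook, and the training tweets below are invented for illustration; a real entry would train on the labelled data provided with the challenge and would likely use a stronger model.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_naive_bayes(examples):
    """Count classes and per-class words from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the text."""
    class_counts, word_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        # Laplace (add-one) smoothing over the training vocabulary.
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for token in tokenize(text):
            if token in vocabulary:  # words never seen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented examples, one per class: 1 = pro-vaccine, 0 = neutral, -1 = anti-vaccine.
toy_training_set = [
    ("vaccines save lives", 1),
    ("vaccines are poison", -1),
    ("clinic opens at nine", 0),
]
model = train_naive_bayes(toy_training_set)
prediction = predict(model, "vaccines save children")
```

With only three training examples the model is a toy; the point is the shape of the pipeline: tokenize, count, then score each of the three labels.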

How to use Colab on Zindi

How to mount a drive on Colab

Variable definition:

tweet_id: Unique identifier of the tweet
safe_tweet: Text contained in the tweet. Some sensitive information, such as usernames and URLs, has been removed
label: Sentiment of the tweet (-1 for negative, 0 for neutral, 1 for positive)
agreement: The tweets were labeled by three people. Agreement indicates the percentage of the three reviewers who agreed on the given label. You may use this column in your training, but agreement data will not be shared for the test set.
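As a rough illustration of the columns defined above, the sketch below parses a few rows and turns agreement into a per-example weight. The sample rows are invented, and two points are assumptions rather than facts from this page: that the training file is a CSV with a header row, and that agreement is stored as a fraction between 0 and 1.

```python
import csv
import io

# Invented rows in the column layout described above; the real training
# file is not reproduced here.
SAMPLE_CSV = """tweet_id,safe_tweet,label,agreement
A1,"Got my flu shot today, feeling fine",1,1.0
B2,Is the clinic open on Sunday?,0,0.666667
C3,I do not trust what is in these vaccines,-1,1.0
"""


def load_tweets(handle):
    """Parse rows into dicts with typed label and agreement fields."""
    rows = []
    for record in csv.DictReader(handle):
        rows.append({
            "tweet_id": record["tweet_id"],
            "text": record["safe_tweet"],
            # float() first so that "1", "1.0" and "-1.0" all parse.
            "label": int(float(record["label"])),
            # Assumption: agreement is a fraction in [0, 1].
            "agreement": float(record["agreement"]),
        })
    return rows


rows = load_tweets(io.StringIO(SAMPLE_CSV))

# One option: weight each training example by how strongly the
# reviewers agreed on its label.
weights = [row["agreement"] for row in rows]
```

Because agreement is not available for the test set, it can only influence training (for example as a sample weight or a filter), never serve as an input feature at prediction time.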

Files

Tweets that you must classify using your trained model.
A starter notebook to help you make your first submission to this challenge.
Labelled tweets on which to train your model.
An example of what your submission file should look like. The order of the rows does not matter, but the IDs must be correct. Values in the 'label' column should range between -1 and 1.
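The submission rules above can be sketched as a small writer that refuses labels outside the allowed range. The 'label' column name comes from the description; the 'tweet_id' header for the ID column is an assumption based on the variable definitions, so check the sample submission file for the exact header. The IDs used here are invented.

```python
import csv
import io


def write_submission(handle, predictions):
    """Write (tweet_id, label) pairs, rejecting labels outside [-1, 1]."""
    writer = csv.writer(handle, lineterminator="\n")
    # Assumed header; the sample submission file is the authority.
    writer.writerow(["tweet_id", "label"])
    for tweet_id, label in predictions:
        if not -1 <= label <= 1:
            raise ValueError(f"label {label} for {tweet_id} is outside [-1, 1]")
        writer.writerow([tweet_id, label])


# Row order does not matter, so predictions can be written as produced.
buffer = io.StringIO()
write_submission(buffer, [("A1", 1), ("B2", 0), ("C3", -1)])
submission_text = buffer.getvalue()
```

In practice the handle would be a file opened with `open(path, "w", newline="")` rather than an in-memory buffer.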
