Social Media Prediction Challenge 📚

Helping Africa

$1 000 USD

Completed (~7 years ago)

Skills you will learn

Natural Language Processing

Prediction

312 joined

28 active

Start

Sep 04, 18

Nov 25, 18

Reveal

Nov 26, 18

About

The data has been split into a test and training set.

train.json (zipped) is the dataset that you will use to train your model. This dataset includes about 2,400 consecutive tweets from each of the companies listed below, for a total of 96,562 tweets.

test_questions.json (zipped) is the dataset to which you will apply your model to test how well it performs. Use your model and this dataset to predict the number of retweets each tweet will receive. The test set consists of the consecutive tweets that immediately follow the tweets provided in the training set, with a maximum of 800 tweets per company. This dataset includes the same fields as train.json except for the retweet_count and favorite_count variables.
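As a rough starting point for the task described above, the sketch below fits a per-account median baseline on training tweets and applies it to test tweets. It assumes each file holds a top-level JSON array of Tweet objects and that each tweet carries a `user.screen_name` field as in the Twitter documentation; the actual file layout should be checked first. The helper names (`load_tweets`, `fit_median_baseline`, `predict`) and the demo account names are hypothetical.

```python
import json
from collections import defaultdict
from statistics import median


def load_tweets(path):
    """Load tweets from a JSON file (assumes a top-level array of tweet dicts)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def fit_median_baseline(train_tweets):
    """Return ({screen_name: median retweets}, overall median) from training tweets."""
    by_account = defaultdict(list)
    for tweet in train_tweets:
        by_account[tweet["user"]["screen_name"]].append(tweet["retweet_count"])
    per_account = {name: median(counts) for name, counts in by_account.items()}
    all_counts = [c for counts in by_account.values() for c in counts]
    return per_account, median(all_counts)


def predict(test_tweets, per_account, overall):
    """Predict each test tweet's retweets as its account's training median.

    Accounts not seen in training fall back to the overall median.
    """
    return [per_account.get(t["user"]["screen_name"], overall) for t in test_tweets]


# Tiny in-memory demo with made-up tweets (not taken from the real dataset).
demo_train = [
    {"user": {"screen_name": "acct_a"}, "retweet_count": 1},
    {"user": {"screen_name": "acct_a"}, "retweet_count": 5},
    {"user": {"screen_name": "acct_a"}, "retweet_count": 9},
    {"user": {"screen_name": "acct_b"}, "retweet_count": 2},
    {"user": {"screen_name": "acct_b"}, "retweet_count": 4},
]
demo_test = [{"user": {"screen_name": "acct_a"}}, {"user": {"screen_name": "acct_c"}}]
medians, fallback = fit_median_baseline(demo_train)
print(predict(demo_test, medians, fallback))
```

A median is used rather than a mean because retweet counts tend to be heavily skewed; a real model would add tweet-level features on top of this baseline.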

sample_submission.csv is an example of what your submission file should look like.

Notes on the data: This data was downloaded from Twitter on 23 August 2018, so the retweet and favorite counts reflect that point in time.

Variables in train.json and test_questions.json are as described in the Twitter documentation:

Tweet Object - https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object

User Object - https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/user-object

Entities Object - https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/entities-object

Geo Objects - https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/geo-objects
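To illustrate how the documented objects might feed a model, the sketch below turns one tweet dict into a flat feature row. It assumes the standard Tweet object fields `created_at`, `text` (or `full_text`), `entities.hashtags`, `entities.user_mentions`, `entities.urls`, and `user.followers_count`; whether every tweet in this dataset carries all of them is not confirmed here. The function name `extract_features` and the sample tweet are hypothetical.

```python
from datetime import datetime

# Timestamp layout used by the Tweet object's created_at field,
# e.g. "Wed Oct 10 20:19:24 +0000 2018".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def extract_features(tweet):
    """Build a flat dict of simple numeric features from one tweet dict."""
    entities = tweet.get("entities", {})
    user = tweet.get("user", {})
    text = tweet.get("full_text") or tweet.get("text", "")
    created = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT)
    return {
        "text_length": len(text),
        "n_hashtags": len(entities.get("hashtags", [])),
        "n_mentions": len(entities.get("user_mentions", [])),
        "n_urls": len(entities.get("urls", [])),
        "followers": user.get("followers_count", 0),
        "hour_utc": created.hour,
        "weekday": created.weekday(),  # Monday is 0
    }


# Made-up tweet for illustration only.
sample_tweet = {
    "created_at": "Wed Oct 10 20:19:24 +0000 2018",
    "text": "New data bundles are live #offers",
    "entities": {"hashtags": [{"text": "offers"}], "user_mentions": [], "urls": []},
    "user": {"followers_count": 1200},
}
print(extract_features(sample_tweet))
```

The favorite_count field is deliberately left out of the features because it is absent from test_questions.json.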

Companies included in this dataset:

Nigeria

Zenith Bank
First Bank Nigeria
Guaranty Trust Bank
Access Bank
Diamond Bank
Ecobank
MTN
Airtel
GloMobile

Ghana

Barclays
Fidelity
Ecobank
Access
Ghana Commercial Bank
MTN
Vodafone
Airtel Tigo

South Africa

Standard Bank
ABSA-Barclays
FNB
Nedbank
Capitec
Vodacom
MTN
Cell C
Telkom

Kenya

Equity
Kenya Commercial Bank
Co-operative Bank
Standard Chartered
Safaricom
Airtel
Telkom

Uganda

Stanbic
DFCU
Standard Chartered
MTN
Airtel

Files

Train contains the target. This is the dataset that you will use to train your model.

The sample submission is an example of what your submission file should look like. The order of the rows does not matter, but the values in the "ID" column must be correct.

Test resembles Train but without the target-related columns. This is the dataset to which you will apply your model.
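A submission file along these lines could be written as in the sketch below. Only the "ID" column name comes from the description above; the target column name `retweet_count`, the ID values, and the helper `write_submission` are assumptions, so the real header and IDs should be copied from sample_submission.csv.

```python
import csv
import io


def write_submission(ids, predictions, out, target_column="retweet_count"):
    """Write ID/prediction rows as CSV to a file-like object.

    target_column is a hypothetical header name; predictions are rounded
    to non-negative integers because retweet counts cannot be negative.
    """
    if len(ids) != len(predictions):
        raise ValueError("ids and predictions must have the same length")
    writer = csv.writer(out)
    writer.writerow(["ID", target_column])
    for row_id, pred in zip(ids, predictions):
        writer.writerow([row_id, max(0, round(pred))])


# Demo writing to an in-memory buffer; the IDs here are made up.
buffer = io.StringIO()
write_submission(["a1", "b2"], [4.6, -1.2], buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open("submission.csv", "w", newline="", encoding="utf-8")`.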
