The data is a subset of Zindi user activity. All variables have been masked to preserve privacy.
The objective of this competition is to create a machine learning model that predicts whether a user will be active on Zindi in the next month. An active user is one who enters a competition, makes a submission, or engages through the discussion forums. Just imagine: you are one of the data points in this challenge!
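The definition of an active user can be written as a simple rule. The sketch below is illustrative only: the class and field names (`MonthlyActivity`, `entered_competition`, `made_submission`, `forum_activity`) are hypothetical, since the real variables are masked.

```python
from dataclasses import dataclass


@dataclass
class MonthlyActivity:
    """One user's activity in one month (hypothetical field names)."""
    entered_competition: bool
    made_submission: bool
    forum_activity: bool  # engaged through the discussion forums


def is_active(activity: MonthlyActivity) -> bool:
    """A user counts as active if any one of the three activity types occurred."""
    return (
        activity.entered_competition
        or activity.made_submission
        or activity.forum_activity
    )
```

Under this rule a user who only made a submission in a month is active, and a user with no recorded activity is not.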
- competitions.csv: this file contains information about hackathons and competitions.
- CompetitionPartipation.csv: this file contains information about users' participation in hackathons and competitions.
- users.csv: this file contains information about the users, such as when they registered and which country they are from.
- submissions.csv: this file contains information about each submission made.
- discussions.csv: this file contains information about every discussion created, such as which userID created it and when it was created.
- comments.csv: this file contains information about every comment made, such as which userID made it and when it was made.
- VariableDefinitions.csv: this file contains information about each table and each variable.
- train.csv: this is a summarized table of the above activities. You can use this table to train your model, but it is recommended that you pull features from the above tables to enrich your model.
- test.csv: this file contains the userID, month and year you need to apply your model to.
- SampleSubmission.csv: this file shows the submission format for this competition. The order of the rows does not matter, but the values in the ‘ID’ column must be correct.
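As a rough illustration of pulling features from the activity tables into the training table, the sketch below counts submissions per user and month and joins the count onto a toy training frame. The column names (`UserID`, `Year`, `Month`), the feature name `n_submissions`, and the helper `add_submission_counts` are assumptions made for this example; the real names are listed in VariableDefinitions.csv.

```python
import pandas as pd


def add_submission_counts(train: pd.DataFrame, submissions: pd.DataFrame) -> pd.DataFrame:
    """Attach a per-user, per-month submission count to the training table.

    Column names (UserID, Year, Month) are assumed for illustration only.
    """
    keys = ["UserID", "Year", "Month"]
    # Count the submission rows for each user in each month.
    counts = submissions.groupby(keys).size().reset_index(name="n_submissions")
    # A left join keeps every training row, including users with no submissions.
    enriched = train.merge(counts, on=keys, how="left")
    enriched["n_submissions"] = enriched["n_submissions"].fillna(0).astype(int)
    return enriched


# Toy data standing in for train.csv and submissions.csv.
train = pd.DataFrame({"UserID": ["u1", "u2"], "Year": [1, 1], "Month": [3, 3]})
submissions = pd.DataFrame(
    {"UserID": ["u1", "u1", "u3"], "Year": [1, 1, 1], "Month": [3, 3, 3]}
)
enriched = add_submission_counts(train, submissions)
```

The same pattern would apply to discussions.csv and comments.csv. In a real pipeline the counts would more likely be taken from months before the one being predicted, so that the features do not contain the outcome itself.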