Predictive Insights Youth Income Prediction Challenge

Helping South Africa
R10 000 ZAR
Challenge completed ~2 years ago
Prediction
Job Opportunity
637 joined
257 active
Start
Jun 08, 23
Close
Oct 01, 23
Reveal
Oct 01, 23
About

The data for this challenge comes from four rounds of a survey of youth in the South African labour market, conducted at 6-month intervals. The survey contains numerical, categorical and free-form text responses. You will also receive additional demographic information such as age and information about school level and results.

Each person in the dataset was surveyed at baseline, one year before the follow-up survey. We are interested in predicting whether a person is employed at the follow-up survey, based on their labour market status and other characteristics at baseline.

The training set consists of one row (observation) per individual: the information collected at baseline plus, from the follow-up, only the target outcome (whether they were employed one year later). The test set consists of the data collected at baseline, without the target outcome.

The objective of this challenge is to predict whether a young person will be employed, one year after the baseline survey, based on their demographic characteristics, previous and current labour market experience and education outcomes, and to deliver an easy-to-understand and insightful solution to the data team at Predictive Insights.
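To make the train/test layout concrete, here is a minimal baseline sketch in Python. It is not the organisers' method: it only scores each test person with the employment rate observed among training rows that share the same baseline status, falling back to the overall rate for statuses not seen in training. The column names ("ID", "Status", "Target") and the tiny in-memory rows are hypothetical stand-ins for the real files, and whether a submission should contain labels or probabilities is not stated here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for the real files: the training rows carry the
# target, the test rows do not. Column names are assumptions.
TRAIN_CSV = """ID,Status,Target
p1,unemployed,0
p2,employed,1
p3,unemployed,1
p4,employed,1
p5,studying,0
"""

TEST_CSV = """ID,Status
t1,unemployed
t2,employed
t3,self employed
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per person."""
    return list(csv.DictReader(io.StringIO(text)))


def rate_by_status(train_rows, key="Status", target="Target"):
    """Share of people employed at follow-up, per baseline status."""
    totals = defaultdict(lambda: [0, 0])  # status -> [employed count, row count]
    for row in train_rows:
        totals[row[key]][0] += int(row[target])
        totals[row[key]][1] += 1
    return {status: hits / n for status, (hits, n) in totals.items()}


def predict(train_rows, test_rows, key="Status", target="Target", id_col="ID"):
    """Score each test person; unseen statuses fall back to the overall rate."""
    overall = sum(int(r[target]) for r in train_rows) / len(train_rows)
    rates = rate_by_status(train_rows, key, target)
    return {r[id_col]: rates.get(r[key], overall) for r in test_rows}


train_rows = read_rows(TRAIN_CSV)
test_rows = read_rows(TEST_CSV)
predictions = predict(train_rows, test_rows)
```

A baseline like this is easy to explain to a non-technical audience, which fits the stated goal of an easy-to-understand solution, and it gives a reference point that any richer model should beat.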

Files
Test resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
R starter notebook.
Python starter notebook.
An example of what your submission file should look like. The order of the rows does not matter, but the "ID" values must be correct.
Train contains the target. This is the dataset that you will use to train your model.
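Since row order does not matter but the IDs must match, it can help to check a submission against the test IDs before uploading. The sketch below is one possible way to do that with the Python standard library. The function name `build_submission` and the "Target" column header are hypothetical; only the "ID" column is named in the description above.

```python
import csv
import io


def build_submission(test_ids, predictions, id_col="ID", target_col="Target"):
    """Return submission CSV text after checking IDs against the test set.

    `predictions` maps each ID to its predicted value. The header names are
    assumptions; copy the real ones from the sample submission file.
    """
    missing = set(test_ids) - set(predictions)
    extra = set(predictions) - set(test_ids)
    if missing or extra:
        raise ValueError(
            f"ID mismatch: missing={sorted(missing)}, unexpected={sorted(extra)}"
        )

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([id_col, target_col])
    # Row order is irrelevant to scoring; sorting just makes the file stable.
    for person_id in sorted(predictions):
        writer.writerow([person_id, predictions[person_id]])
    return buffer.getvalue()


submission = build_submission(["t1", "t2"], {"t2": 1, "t1": 0})
```

Raising an error on a missing or unexpected ID catches the most likely formatting mistake locally instead of at upload time.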