DataDrive2030 Early Learning Predictors Challenge 🎓

Helping South Africa

$3 000 USD

Completed (~3 years ago)

Skills you will learn

Prediction

1002 joined

336 active

Start

Feb 01, 23

Apr 30, 23

Reveal

Apr 30, 23

About

Data from multiple programmes and projects that used the ELOM tools were collated, spanning 2019 to 2022. The different data sources and collection methods are described in a PDF in the download section.

The train set contains 8 665 children and the test set contains 3 600.

In this competition, we aim to use machine learning techniques to identify the factors of early learning programmes that contribute to better learning outcomes in children. Participants predict each child’s ELOM score and identify the top 15 predictors for each child.
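One way to read "top 15 predictors for each child" is to rank features by how much each one moves that child's prediction. The sketch below is only an illustration under stated assumptions: it assumes a linear model whose coefficients are already fitted, and the feature names, coefficients, and means are hypothetical and do not come from the competition data. It is not the organisers' scoring method.

```python
from typing import Dict, List


def per_child_contributions(
    row: Dict[str, float],
    coefs: Dict[str, float],
    means: Dict[str, float],
) -> Dict[str, float]:
    """Contribution of each feature to one child's prediction.

    Assumes a linear model: contribution = coefficient * (value - training mean).
    """
    return {name: coefs[name] * (row[name] - means[name]) for name in coefs}


def top_k_predictors(contributions: Dict[str, float], k: int = 15) -> List[str]:
    """Return the k feature names with the largest absolute contribution."""
    ranked = sorted(
        contributions,
        key=lambda name: abs(contributions[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical feature names and values, for illustration only.
coefs = {"feature_a": 2.0, "feature_b": -0.5, "feature_c": 0.1}
means = {"feature_a": 1.0, "feature_b": 4.0, "feature_c": 10.0}
child = {"feature_a": 1.5, "feature_b": 10.0, "feature_c": 10.0}

contribs = per_child_contributions(child, coefs, means)
top_two = top_k_predictors(contribs, k=2)
print(top_two)
```

With a non-linear model the same ranking step applies, but the per-child contributions would have to come from an attribution method suited to that model.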

The final merged dataset consisted of 12 265 children across 2 217 facilities. Table X below provides a summary of the data included in the meta-dataset. The first column indicates the data source, and the remainder of the columns show the different types of tools or data collected and the number of children we have data for across these sets of variables. An “X” indicates that the data was not collected at all.

How to use Colab on Zindi

How to mount a drive on Colab

Files

Description

Definitions of the variables in test and train.

An example of what your submission file should look like. The order of the rows does not matter, but the child_id values must be correct.
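Because row order does not matter but the child_id values must be correct, a submission can be checked by comparing identifiers as sets. This is a minimal sketch, not an official validator: the `child_id` column name comes from the description above, while the `target` column name, the sample identifiers, and the in-memory CSV text are hypothetical stand-ins for the real files.

```python
import csv
import io
from collections import Counter
from typing import Dict, List


def read_child_ids(csv_text: str) -> List[str]:
    """Read the child_id column from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["child_id"] for row in reader]


def check_submission_ids(
    test_ids: List[str], submission_ids: List[str]
) -> Dict[str, List[str]]:
    """Report ids that are missing, unexpected, or duplicated in a submission."""
    test_set, sub_set = set(test_ids), set(submission_ids)
    counts = Counter(submission_ids)
    return {
        "missing": sorted(test_set - sub_set),
        "unexpected": sorted(sub_set - test_set),
        "duplicated": sorted(i for i, n in counts.items() if n > 1),
    }


# Hypothetical file contents; the real column layout may differ.
test_csv = "child_id\nID_1\nID_2\nID_3\n"
submission_csv = "child_id,target\nID_3,41.0\nID_1,52.5\nID_2,47.3\n"

report = check_submission_ids(
    read_child_ids(test_csv), read_child_ids(submission_csv)
)
print(report)
```

An empty list under every key means the submission covers exactly the test identifiers, regardless of row order.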

Train contains the target. This is the dataset that you will use to train your model.

Test resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.

Information on how the data was collected.

This is a starter notebook to help you make your first submission. If the file opens oddly, press Ctrl+S to save it to your downloads folder.
