4 Jun 2020, 11:50

Meet the winners of the #ZindiWeekendz South African COVID-19 Vulnerability Map Challenge

Zindi is excited to introduce the winners of the #ZindiWeekendz South African COVID-19 Vulnerability Map Challenge. In just 60 hours, the virtual hackathon attracted 341 data scientists from across the continent and around the world, with 179 placing on the leaderboard.

The objective of this challenge was to develop a proof-of-concept for how machine learning can help governments more accurately map COVID-19 risk in 2020 using old data, without requiring a new costly, risky, and time-consuming on-the-ground survey.

The task was to predict the percentage of households that fall into a particularly vulnerable bracket - large households that must leave their homes to fetch water - using 2011 South African census data. Solving this challenge will show that, with machine learning, it is possible to use easy-to-measure statistics to identify the areas most at risk, even in years when census data is not collected.

The winners of this challenge are: Team ColinaCornaCorona from Tunisia in 1st place, Team SomeC from Kenya in 2nd place and Team covid19 sanitizers from Nigeria in 3rd place.

A special thank you to the 3rd place winners for sharing some insights into how they succeeded in this challenge. You can see the winning solutions below.

1st place solution

2nd place solution

3rd place solution

This hackathon has been re-opened as a knowledge competition. You can join here: #ZindiWeekendz Learning: South African COVID-19 Vulnerability Map

Caleb Emelike (3rd place)

Zindi handle: CalebEmelike (Team covid19sanitizers)

Where are you from? Nigeria

Tell us a bit about yourself.

I'm a graduate of Ambrose Alli University. I work at Jerh and Greys Attorney (J&G Attorney), where I am the network administrator. I started my data science career some months ago.

Tell us about the approach you took.

My solution was very simple. I did some feature engineering, grouping some features into rich and poor people in the ward, and I ran KMeans clustering on all the features. I also treated the skewed features, total individuals and total households, by taking their log. For the modelling part, I used K-fold cross-validation with 14 splits, trained a CatBoost model on each split, and took the mean of their predictions.
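The steps Caleb describes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not his actual code: the data is synthetic, the column positions, the cluster count, and the RMSE scoring are made up for the example, and scikit-learn's GradientBoostingRegressor stands in for CatBoost so that the sketch needs only NumPy and scikit-learn. The helper names log_transform and kfold_mean_prediction are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold


def log_transform(X, skewed_cols):
    """Return a copy of X with log(1 + x) applied to the skewed count columns."""
    X = np.asarray(X, dtype=float).copy()
    X[:, skewed_cols] = np.log1p(X[:, skewed_cols])
    return X


def kfold_mean_prediction(X_train, y_train, X_test,
                          n_splits=14, n_clusters=5, seed=0):
    """Add a KMeans cluster feature, fit one model per fold, average test predictions."""
    y_train = np.asarray(y_train, dtype=float)

    # Cluster on all features and append the cluster label as an extra column.
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X_train)
    X_train = np.column_stack([X_train, km.predict(X_train)])
    X_test = np.column_stack([X_test, km.predict(X_test)])

    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    oof = np.zeros(len(X_train))        # out-of-fold predictions for the local CV score
    test_preds = np.zeros(len(X_test))  # running mean of the per-fold test predictions

    for fit_idx, valid_idx in kf.split(X_train):
        # Stand-in for CatBoost (assumption made to keep dependencies small).
        model = GradientBoostingRegressor(n_estimators=50, random_state=seed)
        model.fit(X_train[fit_idx], y_train[fit_idx])
        oof[valid_idx] = model.predict(X_train[valid_idx])
        test_preds += model.predict(X_test) / n_splits

    cv_rmse = float(np.sqrt(np.mean((oof - y_train) ** 2)))
    return test_preds, cv_rmse


# Synthetic demo: columns 0 and 1 play the role of the skewed
# "total individuals" and "total households" counts.
rng = np.random.default_rng(0)
X_tr = rng.lognormal(mean=5.0, sigma=1.0, size=(140, 4))
X_te = rng.lognormal(mean=5.0, sigma=1.0, size=(20, 4))
y_tr = 10 * np.log1p(X_tr[:, 0]) - 5 * np.log1p(X_tr[:, 1]) + rng.normal(0, 1, 140)

preds, cv_rmse = kfold_mean_prediction(
    log_transform(X_tr, [0, 1]), y_tr, log_transform(X_te, [0, 1])
)
print(f"local CV RMSE: {cv_rmse:.3f}, test predictions: {preds.shape[0]}")
```

The out-of-fold score computed here is the "local CV" mentioned in the next answer: every training row is scored by a model that never saw it, so the number can be compared across experiments without relying on the public leaderboard.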

What were the things that made the difference for you that you think others can learn from?

Trusting my local CV.

What are you looking forward to most about the Zindi community?

The Zindi community has been helpful to my data science career. I would also like them to add a place where you can share code and learn from other people's code, just like Kaggle.

What are your thoughts on our winners' feedback? Engage via the Discussion page or leave a comment on social media.