South African COVID-19 Vulnerability Map by #ZindiWeekendz

Helping Africa
$300 USD
Challenge completed over 5 years ago
Prediction
319 joined
177 active
Start: Apr 03, 20
Close: Apr 05, 20
Reveal: Apr 05, 20
My Solution
Notebooks · 6 Apr 2020, 06:48 · edited ~11 hours later · 11

Congrats to the winners.

It was an interesting one. Adjusting the predictions worked well for me here because the train and test data have different distributions. Most adjustments are made with the mean of the target in mind, but I made mine with the max after dropping the outlier.

SVD did the major work for me in terms of feature engineering.
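As a rough illustration of what "SVD features" can mean, here is a minimal sketch using NumPy. The function name `svd_features`, the random stand-in data, and the choice not to center the columns are assumptions for illustration; the actual implementation is in the linked repository and may differ.

```python
import numpy as np

def svd_features(X, n_components, decimals=None):
    """Return row coordinates along the top singular directions of X."""
    X = np.asarray(X, dtype=float)
    if decimals is not None:
        # Optional rounding before the decomposition.
        X = np.round(X, decimals)
    # Thin SVD: X = U @ diag(S) @ Vt, singular values sorted in descending order.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    k = min(n_components, len(S))
    # Equivalent to projecting the rows: X @ Vt[:k].T
    return U[:, :k] * S[:k]

rng = np.random.default_rng(0)
X_demo = rng.random((20, 8))  # hypothetical stand-in for the numeric columns
feats = svd_features(X_demo, n_components=5, decimals=2)
print(feats.shape)
```

The resulting columns can be appended to the original table as extra features. Whether the decomposition is fitted on train only or on train and test together is a separate choice that this sketch does not address.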

feat 1: round columns 3 to the end to 2 decimal places + create 5 SVD features

feat 2: round columns 3 to the end to 1 decimal place + create 4 SVD features

feat 3: multiply total households by all the percentage columns + create 4 SVD features

feat 4: target encoding of lln_01, dw_01, psa_00, dw_07, dw_08 after rounding to 2 decimal places
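Target encoding on a rounded column can be sketched as below. This is a minimal, assumed version: the helper names, the toy values, and the fallback to the global mean for unseen values are illustrative and not taken from the post.

```python
from collections import defaultdict

def fit_target_encoding(values, targets, decimals=2):
    """Map each rounded value to the mean target observed for it in training."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for v, t in zip(values, targets):
        key = round(v, decimals)
        sums[key] += t
        counts[key] += 1
    mapping = {k: sums[k] / counts[k] for k in sums}
    global_mean = sum(targets) / len(targets)
    return mapping, global_mean

def apply_target_encoding(values, mapping, global_mean, decimals=2):
    """Encode values; rounded values never seen in training get the global mean."""
    return [mapping.get(round(v, decimals), global_mean) for v in values]

# Hypothetical toy data standing in for one column such as dw_01.
train_vals = [0.101, 0.104, 0.25, 0.25]
train_target = [10.0, 20.0, 30.0, 50.0]
mapping, gmean = fit_target_encoding(train_vals, train_target)
encoded_test = apply_target_encoding([0.099, 0.251, 0.9], mapping, gmean)
print(encoded_test)
```

Rounding first groups nearby values into the same bucket, which is what makes target encoding meaningful for a continuous column. Encoding the training rows with a mapping fitted on those same rows can leak the target, so an out-of-fold variant is often preferred.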

In all, I ended up with 131 features after dropping features with no variance.
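Dropping features with no variance amounts to removing constant columns. A minimal sketch, with a hypothetical helper name and toy data:

```python
import numpy as np

def drop_constant_columns(X, names):
    """Keep only the columns whose values are not all identical."""
    X = np.asarray(X, dtype=float)
    # A column has zero variance exactly when its max equals its min.
    keep = X.max(axis=0) > X.min(axis=0)
    kept_names = [n for n, k in zip(names, keep) if k]
    return X[:, keep], kept_names

X_demo = np.array([[1.0, 5.0, 0.2],
                   [2.0, 5.0, 0.2],
                   [3.0, 5.0, 0.7]])
X_kept, kept = drop_constant_columns(X_demo, ["a", "b", "c"])
print(kept)
```

Comparing max and min avoids floating-point noise that a computed standard deviation can show on constant columns.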

Single LightGBM, 5-fold.
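The 5-fold setup can be sketched as follows. To keep the example dependency-free, an ordinary least-squares model stands in for LightGBM. The function names, the averaging of test predictions across folds, and the use of RMSE as the validation metric are all assumptions, not details confirmed by the post.

```python
import numpy as np

def kfold_indices(n_rows, n_splits=5, seed=0):
    """Yield (train_idx, valid_idx) pairs over shuffled row indices."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_rows), n_splits)
    for i in range(n_splits):
        train_idx = np.concatenate([folds[j] for j in range(n_splits) if j != i])
        yield train_idx, folds[i]

def least_squares_fit_predict(X_tr, y_tr, X_list):
    """Stand-in model (NOT LightGBM): ordinary least squares with an intercept."""
    def add_bias(A):
        return np.column_stack([A, np.ones(len(A))])
    coef, *_ = np.linalg.lstsq(add_bias(X_tr), y_tr, rcond=None)
    return [add_bias(A) @ coef for A in X_list]

def cross_val_predict_average(X, y, X_test, fit_predict, n_splits=5):
    """Collect out-of-fold predictions and average test predictions over folds."""
    oof = np.zeros(len(X))
    test_pred = np.zeros(len(X_test))
    for train_idx, valid_idx in kfold_indices(len(X), n_splits):
        valid_pred, fold_test_pred = fit_predict(
            X[train_idx], y[train_idx], [X[valid_idx], X_test]
        )
        oof[valid_idx] = valid_pred
        test_pred += fold_test_pred / n_splits
    rmse = float(np.sqrt(np.mean((oof - y) ** 2)))
    return oof, test_pred, rmse

rng = np.random.default_rng(1)
X_demo = rng.random((50, 3))                          # hypothetical features
y_demo = X_demo @ np.array([2.0, -1.0, 0.5]) + 3.0    # noiseless toy target
X_test_demo = rng.random((10, 3))
oof, test_pred, cv_rmse = cross_val_predict_average(
    X_demo, y_demo, X_test_demo, least_squares_fit_predict
)
print(cv_rmse)
```

Any regressor with a fit/predict interface could be passed in place of `least_squares_fit_predict`; the out-of-fold score is what gets compared against the leaderboard.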

Before adjustment: LB 3.765, Private 3.769

After adjustment: LB 3.710, Private 3.69

https://github.com/horlar1/Zindi-SA-Hack

Looking forward to the top solutions.

Discussion 11 answers

Thanks a lot for sharing the solutions

6 Apr 2020, 06:57
marcusinthesky

Thank you so much for sharing. I think sharing solutions is something we should really encourage on Zindi so people can learn and improve.

I ended up trying two solutions: (1) regularized generalized linear models with PCA on polynomial features, and (2) multivariate adaptive regression splines.

You can find my solution on my GitHub at: https://github.com/marcusinthesky/Zindi-ZA-COVID19-Vulnerablity-Map

6 Apr 2020, 07:08

Thanks a lot. Please find my solution here.

Just a humble attempt at using CatBoost. Only boilerplate code, no feature engineering. Finished in 79th place.

https://anindabitm.github.io/anindadslog/2020/04/06/Zindi_Hack.html

Thanks Holar, great idea to use SVD for feature engineering given all the correlations.

Can you give a little bit more information about how you scaled the target?

6 Apr 2020, 07:37

Thanks.

I adjusted my predictions to look like my target after removing the outlier. The target max was 54.8 and my prediction max was 52.

Multiplying the predictions by 1.04 moves them closer to the target max.
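The arithmetic can be checked with a short sketch. The maxima and the 1.04 factor come from the post; the helper name and the sample predictions are hypothetical.

```python
def scale_predictions(preds, factor):
    """Multiply every prediction by a constant factor."""
    return [p * factor for p in preds]

target_max = 54.8  # from the post: target max after removing the outlier
pred_max = 52.0    # from the post: max of the raw predictions
factor = 1.04      # from the post

# Factor that would make the two maxima match exactly (about 1.054).
full_ratio = target_max / pred_max

demo_preds = [10.0, 30.0, 52.0]  # hypothetical raw predictions
adjusted = scale_predictions(demo_preds, factor)
print(round(full_ratio, 3), round(adjusted[-1], 2))
```

With a factor of 1.04 the prediction max moves from 52 to about 54.08, which is closer to 54.8 without fully matching it.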

Thanks for sharing everyone!

I've uploaded my solution here: https://github.com/Rendiere/zindi-sa-covid-19-vulnerability-hackathon

6 Apr 2020, 09:24
CapitainData
UM6P

Thanks!

Here, you can find mine! https://colab.research.google.com/drive/1Lv0ecoSoTGun2kFSUSj6lcn2-Ike3XhO

6 Apr 2020, 11:06
Federal University of Agriculture Abeokuta

Thanks a lot. Your solution is very straightforward and easy to understand.

msamwelmollel
University of Glasgow

Hi Holar! First of all, thank you for sharing your code with us. Some of us are new to ML, and we certainly need some tips and tricks from a guru like you. If you don't mind, I would like to ask for your source code, from data processing to the implementation of the solution. Specifically, I would like to know more about the feature engineering and the use of SVD to generate the features.

And for others sharing code, please indicate the CV and LB scores for your solution. Thank you all.

6 Apr 2020, 14:42
Mahmoud_Trigui
Freelance

thanks for sharing :)