DataDrive2030 Early Learning Predictors Challenge

Helping South Africa
$3 000 USD
Completed (almost 3 years ago)
Prediction
1001 joined
336 active
Start: Feb 01, 23
Close: Apr 30, 23
Reveal: Apr 30, 23
Juliuss
Freelance
Private Leaderboard
Platform · 24 Apr 2023, 09:51 · 5

Hello Zindians!

I hope you are all making good progress in the current challenge. I've been waiting to see someone break an RMSE of 8 on the public leaderboard, but it's been a long wait. The top scientists are currently at around 9.1, so they are almost there!

I am relatively new to Zindi, but I have participated in a few challenges over the past six months. My first challenge was the Zindi Birthday Challenge, which was tough, and I didn't do very well. Unfortunately, there were some issues with the private leaderboard, and only the top five got their public scores updated: https://zindi.africa/competitions/zindi-new-user-engagement-prediction-challenge/leaderboard. For the rest of us, our public scores were final.

This current challenge has been the toughest for me, with a lot of variables, missing values, and mixed data types. I've tried some tricks, but nothing has worked satisfactorily. I've also noticed that some other contestants have raised concerns about the scores not updating correctly: here https://zindi.africa/competitions/datadrive2030-early-learning-predictors-challenge/discussions/16489 and here, in the thread "My best score is not used as the public score". I also noted that the top guy's scores are updating only marginally. Could there be an issue with the platform regarding how the scores are updating?

Recently, I received a message from Zindi that they would take the top 20 submissions when the competition closes on April 30th, cross-check them, and release the private leaderboard by May 21st. I'm not sure if this means only the top 20 submissions will receive private leaderboard scores. In the other few contests I have joined, the private leaderboard was revealed immediately when the competition closed, and the top contestants' submissions were then requested for scrutiny before the winners were announced.

I'm just trying to familiarize myself with the platform, so I would appreciate some clarification from @Zindi if possible. Thank you.

Discussion 5 answers
Ecole polytechnique de tunisie

Your model is likely to be overfitting. That’s an issue I have encountered from the very beginning. I certainly believe that you, as well as others, have raised these concerns because you achieved good scores locally, which didn’t align with the LB score. I, too, had some scores between 9.5 and 10 locally, but the LB score was roughly the same, between 10 and 10.15, no matter how good (or bad) my local scores were.

24 Apr 2023, 10:27
Upvotes 2
Koleshjr
Multimedia university of kenya

Yeah, overfitting is a pretty big challenge in this competition. You can even get below 8 in CV, but the LB score gets worse. So I don't think there is any problem with the leaderboard scoring.

Juliuss
Freelance

Thank you to my esteemed Zindians for providing clarification. It is now evident where the issues lie, and I appreciate your valuable input. All the best.

Raheem_Nasirudeen
The polytechnic ibadan

Kindly note that the cross-validation scheme is very important. More importantly, the top 20 submissions are judged on how they perform on the private leaderboard, which is the best way for it to be done.

24 Apr 2023, 11:14
Upvotes 2
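
To make the point about validation schemes concrete, here is a minimal sketch of the gap between a score measured on the rows a model was fit on and an out-of-fold (cross-validated) RMSE. Everything in it is an assumption made for illustration: the data is synthetic rather than the competition data, the 1-nearest-neighbour "model" is a toy chosen because it memorises its training rows, and the helper names (`rmse`, `kfold_indices`, `predict_1nn`, `train_and_cv_rmse`) are hypothetical. It is not the approach used by anyone in this thread.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error, the metric discussed in this thread."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def kfold_indices(n_rows, n_splits, seed=0):
    """Shuffle row indices once and deal them into n_splits disjoint folds."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_splits] for i in range(n_splits)]


def predict_1nn(train_x, train_y, x):
    """Toy model that memorises the training rows (1-nearest neighbour)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def train_and_cv_rmse(xs, ys, n_splits=5, seed=0):
    """Return (RMSE on the rows the model was fit on, out-of-fold RMSE)."""
    # Scoring on the training rows themselves is optimistic by construction.
    train_pred = [predict_1nn(xs, ys, x) for x in xs]
    # Out-of-fold: every row is scored by a model that never saw that row.
    oof_pred = [0.0] * len(xs)
    for fold in kfold_indices(len(xs), n_splits, seed):
        held_out = set(fold)
        fit_x = [x for i, x in enumerate(xs) if i not in held_out]
        fit_y = [y for i, y in enumerate(ys) if i not in held_out]
        for i in fold:
            oof_pred[i] = predict_1nn(fit_x, fit_y, xs[i])
    return rmse(ys, train_pred), rmse(ys, oof_pred)


# Synthetic data (NOT the competition data): a linear signal plus noise.
rng = random.Random(42)
xs = [rng.uniform(0, 100) for _ in range(200)]
ys = [0.5 * x + rng.gauss(0, 10) for x in xs]

train_score, cv_score = train_and_cv_rmse(xs, ys)
print(f"train RMSE: {train_score:.2f}, out-of-fold RMSE: {cv_score:.2f}")
```

The memorising model looks perfect on its own training rows and much worse out of fold. A similar but subtler gap can open between a local CV score and the leaderboard when the folds do not resemble the hidden test set, which is one reason the choice of validation scheme matters.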
Juliuss
Freelance

Thank you Raheem for the valuable clarification. I plan to utilize cross-validation techniques in the remaining time available, and I appreciate the guidance that has been provided.