
AI4D Malawi News Classification Challenge

Helping Malawi
$2 000 USD
Completed (almost 5 years ago)
Classification
830 joined
322 active
Start: Jan 22, 21
Close: May 09, 21
Reveal: May 09, 21
Josplay enterprise
Competition Malfunction
Help · 10 May 2021, 15:47 · 7

This is my first ever machine learning competition after making a career transition from 8+ years in software engineering. I was in 10th position in the competition, and suddenly when it closed I was in 67th position. What happened?

Discussion 7 answers

Generally, competitions have test data hidden from the public. This hidden test data is used to evaluate submissions after the competition deadline. That is why your ranking can change once the competition completes: submissions are scored on this private test data only after the competition ends.

10 May 2021, 15:56
Upvotes 0
Kamenialexnea
Ecole nationale superieure polytechnique yaounde

There are two types of scores as in all data science competitions:

* The public score: the one used to rank us during the competition.

* The private score: the final score we should try to optimize, because it is the score we are ultimately ranked on.

I let @Zindi confirm, but I think that the test dataset is split in two: a part for the public score and another for the private score
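The split described above can be sketched as follows. This is a hedged illustration, not Zindi's actual mechanism: the IDs are hypothetical and the 30/70 public/private ratio is an assumption, since the real proportion is not disclosed.

```python
# Sketch: how a platform might split hidden test data into a "public"
# slice (scored live on the leaderboard) and a "private" slice
# (scored only after the competition closes).
import random

random.seed(0)
test_ids = [f"ID_{i}" for i in range(100)]  # hypothetical test-row IDs
random.shuffle(test_ids)

cut = int(0.3 * len(test_ids))  # assumed 30/70 ratio, for illustration only
public_ids = set(test_ids[:cut])    # drives the live (public) leaderboard
private_ids = set(test_ids[cut:])   # determines the final (private) ranking

print(len(public_ids), len(private_ids))
```

Because the two slices are disjoint, a submission can score well on the public slice yet drop on the private one, which is exactly the shake-up described in this thread.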

10 May 2021, 15:56
Upvotes 0
_MUFASA_

hi @crimacode !

Probably a problem of overfitting. This usually happens when your model does well on the training set (and, more broadly, on data that follows the training distribution) but not as well on the test set. In other words, your model does not generalize well to unseen data.

Since Zindi scores submissions on two different sets (one during the competition, a.k.a. the public LB set, and another after the competition ends, a.k.a. the private LB set), the leaderboard unfortunately tends to get reshuffled after each competition.

The aim of most competitions is to get a better score on the private LB (i.e. better results on the evaluation metric), but since you only have access to the public set, you have to use techniques such as cross-validation and early stopping to make sure your model is doing a good job.
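Cross-validation, mentioned above, can be sketched like this. The texts and labels are toy English stand-ins (not the competition data), and the TF-IDF + logistic regression pipeline is just one illustrative choice.

```python
# Minimal sketch of k-fold cross-validation as a local guard against
# overfitting to the public leaderboard.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline

texts = [
    "parliament debates the national budget",
    "the president addressed the assembly",
    "new election laws were passed today",
    "the striker scored twice in the final",
    "the team won the league championship",
    "fans celebrated the cup victory",
]
labels = ["politics", "politics", "politics", "sports", "sports", "sports"]

model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))

# Stratified folds preserve the class balance in each train/validation split.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
scores = cross_val_score(model, texts, labels, cv=cv, scoring="accuracy")

print(f"fold accuracies: {scores}")
```

The mean cross-validation score is generally a steadier estimate of unseen-data (private-LB) performance than a single public-LB score, which is computed on only one fixed slice of the test set.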

Anyway, keep going!!!

10 May 2021, 16:06
Upvotes 0
MICADEE
LAHASCOM

@drcod Seriously, the same thing happened here with my team. Very painful. My team would actually have been in 8th position on the private leaderboard had we selected our best CV submission, which is at times very hard to know.

11 May 2021, 10:45
Upvotes 0
saheedniyi
University of lagos

I think it's part of the competition too; you should develop an intuition about which model will perform better in production.

11 May 2021, 19:37
Upvotes 0
flamethrower

Yes, it's heartbreaking at the end of a competition to witness a leaderboard shake-up. But beyond competitions, when we choose to deploy a selected model in the real world, there is no LB at test time to help us pick the best model; we have only our training data, how we partitioned it for validation, and our intuition and trust in the developed model's performance.

If we get used to Zindi helping us pick submissions, are we really doing good data science?

At the end of the day, competitions should prepare us by developing good intuitions for real-world deployment, not teach us to rely on practices that don't generalize outside competitions.

MICADEE
LAHASCOM

Yeah... you're very right. That's just it: no going back on what you've already chosen as your best CV, even though it can be painful at times. This is just a lesson and additional experience gained on this data science journey. The most important thing in this kind of competition is to learn, learn again, and re-learn in order to make concrete and valuable decisions; this also helps in developing good intuitions for real-world deployment.