Uber Movement SANRAL Cape Town Challenge

Helping South Africa
$5 500 USD
Completed (~6 years ago)
Prediction
Anomaly Detection
Forecast
859 joined
133 active
Start
Oct 11, 19
Close
Feb 09, 20
Reveal
Feb 10, 20
Public vs. Private LB
Help · 22 Jan 2020, 23:18 · 3

"Note that there is Public and Private Leaderboards. The Public Leaderboard excludes approximately 50% of the test dataset. While the competition is open, the Public Leaderboard will rank the submitted solutions by the accuracy score they achieve. Upon close of the competition, the Private Leaderboard, which covers 100% of the test dataset, will be made public and will constitute the final ranking for the competition."

Just to be clear, does this mean that the final standings are based on the score over the entire test set, i.e. Public AND Private? If yes, why is it done this way? What's to stop people from probing the Public LB for the answers, or otherwise exploiting the Public LB? Other Zindi competitions seem to have the same paragraph in the rules. Every other competitive data science platform that I know of bases the final scores only on the Private LB.
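To make the concern concrete, here is a minimal sketch of the scoring scheme as the quoted rule describes it. It is not Zindi's actual scoring code: the function names, the toy binary labels, the random 50% split, and the use of plain accuracy as the metric are all assumptions for illustration.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels (assumed metric)."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def leaderboard_scores(y_true, y_pred, public_idx):
    """Score one submission the way the quoted rule reads (hypothetical helper).

    Public score: only the rows in public_idx.
    Private score: the full test set, public rows included.
    """
    public = accuracy([y_true[i] for i in public_idx],
                      [y_pred[i] for i in public_idx])
    private = accuracy(y_true, y_pred)
    return public, private


rng = random.Random(0)
n = 1000
y_true = [rng.randint(0, 1) for _ in range(n)]            # toy labels
public_idx = sorted(rng.sample(range(n), n // 2))         # assumed 50% split
public_set = set(public_idx)

# A submission that has "probed" the public rows: perfect on them,
# a coin flip on the rows it never got feedback on.
y_pred = [y_true[i] if i in public_set else rng.randint(0, 1)
          for i in range(n)]

public, private = leaderboard_scores(y_true, y_pred, public_idx)
print(f"public={public:.3f} private={private:.3f}")
```

Because the public rows are half of the full set in this sketch, a submission that is perfect on them cannot score below 0.5 on the Private LB, however poorly it generalizes to the hidden rows. With a disjoint Private split, that floor would not exist.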

Discussion 3 answers

Ohh, that is interesting and slightly peculiar. Thanks for the heads-up.

I guess this approach makes it less limiting than "Your highest-scoring solution will be the one by which you are judged." (taken from the paragraph just before the one you were referencing). Theoretically, if the public and private sets were mutually exclusive, that could've been annoying. One could have been afraid of overfitting the LB relative to one's best CV model, but been unable to change one's preferred submission. Less of a risk now.

Also likely to lead to less of a shake-up when the competition ends. Whether that is preferable probably depends on your position on the LB. :P

23 Jan 2020, 07:31
Upvotes 0
Raheem_Nasirudeen
The polytechnic ibadan

After the competition ends, they will release the scores on the full test data, and according to @cobusburger, Zindi will be the one to choose the best private score from all your submissions, because the submission with the best public leaderboard score may be overfit and cause a leaderboard shake-up. A shake-up will still happen, though, 100%.

23 Jan 2020, 08:59
Upvotes 0
ZINDI

Dear brandenkmurray,

While we endeavor to do a complete Public/Private Leaderboard split of the test data, many of our competition datasets have been limited in size. This is the reality of working on real challenges for various organizations. For this reason, you are right, in many competitions, we’ve used the full test set for the Private Leaderboard. How we do the split is noted in the rules of each competition. In the future, we will aim to do more complete Public/Private splits.

24 Jan 2020, 15:27
Upvotes 0