
Africa Biomass Challenge

Helping Côte d'Ivoire
$10 000 USD
Completed (almost 3 years ago)
Earth Observation
Prediction
1223 joined
276 active
Start: Jan 27, 23
Close: May 21, 23
Reveal: May 21, 23
Potentially serious problem with the competition.
Data · 24 Apr 2023, 13:28 · 6

TL;DR: if you wanted to cheat via data leakage, there is a very easy way to do so.

Hey @Zindi and all,

Start with your current best model. Now change just your first prediction: if you see no difference in the public score, that row is in the held-out (private) test set, so move on to the next one. Otherwise, manually change that prediction until its error is zero.

In this way, you can find the true values of the labels, at least for the public test set. Because there are only 90 test points, of which only around 30% are in the public test set, you could easily do this within 300 submissions. (There are of course smarter / more efficient ways of going about this, but you get the gist.)

How is @Zindi addressing this? I'm not sure how you could even ban someone from doing this, since changing the value you predict for a test point and seeing what it does to the public leaderboard is just another kind of model debugging technique you could use.
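The probing strategy described in the post can be sketched against a mock leaderboard. Everything below is hypothetical: the labels, the public split, and the RMSE metric are assumptions for illustration, not the competition's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Mock setup (illustration only; the real values are unknown) ---
N_TEST = 90
y_true = rng.normal(100.0, 20.0, N_TEST)                 # hidden labels
public_idx = rng.choice(N_TEST, size=27, replace=False)  # ~30% public split

def public_score(preds):
    """Mock leaderboard: RMSE on the hidden public subset."""
    e = preds[public_idx] - y_true[public_idx]
    return float(np.sqrt(np.mean(e ** 2)))

# --- Step 1: find which rows are in the public set ---
base_preds = np.full(N_TEST, 100.0)
s0 = public_score(base_preds)          # 1 baseline submission

found_public = []
for i in range(N_TEST):                # 1 submission per row
    probe = base_preds.copy()
    probe[i] += 1000.0                 # large perturbation
    if public_score(probe) != s0:      # score moved -> row is public
        found_public.append(i)

n = len(found_public)                  # size of the public set

# --- Step 2: recover the true label of a public row ---
# With RMSE, perturbing row i by d satisfies
#   n * (s1^2 - s0^2) = d^2 + 2*d*(p_i - y_i),
# which solves for y_i with one extra submission.
i = found_public[0]
p, d = base_preds[i], 10.0
probe = base_preds.copy()
probe[i] += d
s1 = public_score(probe)
y_rec = p + d / 2 - n * (s1 ** 2 - s0 ** 2) / (2 * d)
assert abs(y_rec - y_true[i]) < 1e-6   # recovered to numerical precision
```

Under these assumptions the full attack costs roughly 1 baseline + 90 mapping submissions + 1 recovery submission per public row, around 120 submissions in total, comfortably inside a 300-submission budget.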

Discussion · 6 answers

Great question!

I am not too afraid of this issue, as we always overfit the public LB in one way or another. However, I am scared of two groups.

Group 1: people who share ideas and data but don't merge into one team. There will be a shakeup in this competition, so having "more final submissions" will increase your chance to end up on top. Four people in one team have less chance to win than four separate teams.

Group 2: people who know how to select the best submissions for the private LB. Plato is an excellent example. We need to learn from this group.

24 Apr 2023, 18:19
Upvotes 1

I'm not talking about "somewhat" overfitting to the public LB one way or another; I'm saying you could reliably get a perfect score on the leaderboard within 300 submissions, which is much more problematic.

Unless of course @Zindi is fine with this?

Also, the group 1 and group 2 argument applies to every competition; my criticism applies only to this particular one.

Okay, if a team got a score of zero, then what?

1. The private set is different from the public set, so it is not a big deal.

2. I don't think Zindi allows them to use the labels directly in the code.

PS: I knew (and Zindi should know) that people can probe this LB. In the worst case, 100 submissions would tell you which rows are in the public set. With the remaining 200 submissions for ~30 public rows, about 7 submissions per row, you could easily probe the LB.

Then we are in agreement :)

I also noticed that. I tried submitting the test data itself, and the result isn't perfect.

You don't have a code rerun on the Zindi platform, so the public/private LB are computed from the submission file alone.

If the "perfect submission" isn't good, it could mean that the part of the test dataset used for scoring is somehow incoherent or was modified.

The challenge could also be to properly use the test set to figure out which part is actually scored.

29 Apr 2023, 23:46
Upvotes 0