
Digital Africa Plantation Counting Challenge

Helping Côte d'Ivoire
$10 000 USD
Challenge completed over 2 years ago
Prediction
Computer Vision
Object Detection
701 joined
219 active
Start: Feb 23, 23
Close: Apr 09, 23
Reveal: Apr 09, 23
Why RMSE with such high label noise?
Data · 10 Mar 2023, 23:54 · 9

Mean Absolute Error would suit the counting goal better.

Some of the training images are clearly mislabeled (35 vs 0 trees). If there are similar large labeling errors in the test set, they would ruin the leaderboard with the current metric.

How do you make sure the test set has better labeling quality?
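To illustrate with synthetic numbers (not actual challenge data), here is a quick sketch of how much more a single large label error distorts RMSE than MAE:

```python
import numpy as np

# Synthetic test set: 100 perfectly predicted images plus one
# mislabeled image (label says 35 trees, prediction and true count are 0).
errors = np.zeros(101)
errors[-1] = 35  # the single mislabeled image

rmse = np.sqrt(np.mean(errors ** 2))  # squares the outlier: ~3.48
mae = np.mean(np.abs(errors))         # dilutes it linearly: ~0.35

print(f"RMSE: {rmse:.2f}")
print(f"MAE:  {mae:.2f}")
```

One bad label makes RMSE roughly ten times larger than MAE here, which is why the metric choice matters so much under label noise.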

Discussion · 9 answers
HungryLearner
Khalifa University

@zindi, please address this. There are label errors in the training set... hope the test set is free of such errors?

11 Mar 2023, 07:42
Upvotes 8

Just to be clear: label errors are expected in every labeling process. Switching the metric to MAE would reduce the impact of such errors.

Another option would be for the organizers to double-check the test set labels, but that could take a few hours of manual work. And as I said, labeling errors are always expected :)

11 Mar 2023, 09:24
Upvotes 7

An example of very bad labelling:

12 Mar 2023, 20:08
Upvotes 10
Koleshjr
Multimedia university of kenya

Can we correct these errors manually, or how are we supposed to handle them?

According to this https://zindi.africa/competitions/digital-africa-plantation-counting-challenge/discussions/15368 we're not allowed to do so. But the problem is that if the test data contains such large labeling errors, it will ruin the leaderboard.

PS: Based on my experiments, at least the public LB contains such errors!

Based on my visualization check, the training set contains a lot of label noise. Hopefully the private dataset is free of such noise.
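For anyone who wants to run a similar check, a minimal sketch (the file names Train.csv / TrainImages/ and the columns ImageId / Target are assumptions; adjust to the actual challenge files):

```python
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image

train = pd.read_csv("Train.csv")  # assumed columns: ImageId, Target

# Show the images with the highest labeled counts; extreme labels
# on visually empty plots are easy to spot this way.
suspects = train.sort_values("Target", ascending=False).head(9)

fig, axes = plt.subplots(3, 3, figsize=(9, 9))
for ax, (_, row) in zip(axes.flat, suspects.iterrows()):
    ax.imshow(Image.open(f"TrainImages/{row.ImageId}"))
    ax.set_title(f"label: {row.Target} trees")
    ax.axis("off")
plt.tight_layout()
plt.show()
```

Sorting the other way (lowest counts first) is just as useful for catching images full of trees that are labeled 0.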

Amy_Bray
Zindi

After discussing with the host, the challenge timeline and error metric will stay the same.

Good luck with the last push.

31 Mar 2023, 08:43
Upvotes 0
HungryLearner
Khalifa University

@amyflorida626,

While labeling error is understandable in the training set, my number one question is whether it was clearly avoided in the test set.

My guess is not: train and test were probably a random split, since they show similar error scores.