Makerere Passion Fruit Disease Detection Challenge

Helping Uganda
$1 000 USD
Completed (over 4 years ago)
Classification
Computer Vision
913 joined
171 active
Start: Aug 20, 21
Close: Nov 21, 21
Reveal: Nov 21, 21
Systematic incoherent labelling on Train.csv
Data · 31 Aug 2021, 08:59 · 3

Hi, I would like to report many inconsistencies in the dataset labelling. Sometimes clearly visible fruits are labelled and sometimes they are not; likewise, hidden fruits are sometimes labelled and sometimes not. This inconsistency is a problem for the detection model. Can we add the missing labels ourselves (only for Train.csv)? Are these inconsistencies also present in the test set?

Thanks

Discussion 3 answers
Amy_Bray
Zindi

Hi alenic,

Yes, there are inconsistencies, but this is real-life data. Think of different ways to address them, and consider that these inconsistencies might carry over to the test set.

31 Aug 2021, 09:09

Thank you for the reply. In my opinion, handling a noisy training set is not a problem, but noise in the test set is, because the test set is what judges which algorithm is best. If the test labels follow the same distribution, then to improve the score you have to "learn" to model the labelling process itself. In other words, you have to learn the test bias, which in my opinion is not ideal for a production solution. But OK, thanks for the information :)

That's the whole challenge. Otherwise, wouldn't @amyflorida626 have used their own baseline with a few tweaks?