Hi, I would like to report many inconsistencies in the dataset labeling. Sometimes fruits are clearly visible but not labeled, and sometimes they are; sometimes hidden fruits are labeled and sometimes not. This labeling inconsistency is a problem for the detection model... Can we tag these missing labels ourselves (only in Train.csv)? Are these inconsistencies also present in the test set?
Thanks
Hi alenic,
Yes, there are inconsistencies, but this is real-life data. Think of different ways to address them, and consider that these inconsistencies might carry over to the test set.
Thank you for the reply. In my opinion, handling a noisy training set is not a problem, but noise in the test set makes it hard to judge which algorithm is actually best: to improve the score you have to "learn" how to model the labeling process itself (assuming the test distribution is the same), in other words, learn the test bias, which is not ideal for a production solution IMO. But OK, thanks for the information :)
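For what it's worth, one cheap way to make training less sensitive to inconsistent annotations (not something from this thread, just a common trick) is label smoothing on the classification targets, so the model is never pushed to be fully confident about a possibly wrong label. A minimal numpy sketch, assuming one-hot class targets:

```python
import numpy as np

def smooth_labels(one_hot, eps=0.1):
    """Soften one-hot targets: the true class gets 1 - eps plus its share
    of the redistributed mass, every class gets eps / n_classes."""
    n_classes = one_hot.shape[-1]
    return one_hot * (1.0 - eps) + eps / n_classes

# e.g. smooth_labels(np.array([1.0, 0.0, 0.0]), eps=0.3) gives [0.8, 0.1, 0.1]
```

In a real detection pipeline you would apply this (or the equivalent built-in option of your loss function) to the class targets of each box rather than re-labeling Train.csv by hand.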
That's the whole challenge. Otherwise, wouldn't @amyflorida626 have used their own baseline with a few tweaks?