Makerere Passion Fruit Disease Detection Challenge
Helping Uganda · $1,000 USD
Completed (over 4 years ago)
Classification · Computer Vision
913 joined · 171 active
Start: 20 Aug 2021 · Close: 21 Nov 2021 · Reveal: 21 Nov 2021
Weird competition metric
2 Sep 2021, 18:55 · edited 2 minutes later · 8

If I submit predictions from a model trained for 30 epochs, it gets a worse score than the same model trained for only 1 epoch. The less-trained model predicted far more incorrect boxes than the long-trained one. So spamming imprecise bboxes results in a higher score than predicting the fruits correctly. This doesn't make any sense. The metric should give a higher score to more precise predictions, not reward more predictions. @zindi
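To illustrate what such a metric would look like: this is a hypothetical sketch, not the competition's actual scoring code. If a metric only counts matched ground-truth boxes (recall at some IoU threshold) and ignores false positives, then adding extra boxes can never lower the score, which matches the behavior described above. Box format and thresholds here are assumptions.

```python
# Sketch of a recall-only metric that rewards box spam.
# Boxes are (x1, y1, x2, y2); IoU threshold 0.5 is an assumption.

def iou(a, b):
    # Intersection-over-union of two axis-aligned boxes.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def recall_only_score(preds, gts, thr=0.5):
    # Fraction of ground-truth boxes matched by at least one prediction.
    # False positives are never penalized, so more boxes never hurts.
    hit = sum(any(iou(p, g) >= thr for p in preds) for g in gts)
    return hit / len(gts)

gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
precise = [(0, 0, 10, 10)]  # one accurate box
spam = precise + [(i, i, i + 10, i + 10) for i in range(0, 40, 5)]  # many extras

print(recall_only_score(precise, gts))  # 0.5
print(recall_only_score(spam, gts))     # 1.0 – spamming covers the second box
```

A proper mAP implementation penalizes those extra boxes as false positives at each confidence level, which is why the complaint suggests the metric in use is closer to recall than to mAP.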

Discussion · 8 answers

I totally agree with you; it seems a worse model scores better than a more robust one, and you can check this by visualizing the predictions on the test set... I have posted a topic about that, because there are strong inconsistencies in the labeling, and I'm worried about the fairness of the test set...

3 Sep 2021, 07:22
Upvotes 0
ASSAZZIN

Thanks @derInformatiker for bringing up this topic.

  • I also trained my model for 20 epochs and it gives a worse score than another model trained for 4 epochs.
  • I also noticed that whenever your submission is larger (submission.shape[0]), you get a better score.
  • Am I the only one who has a weird prediction distribution:

fruit_brownspot: 733 · fruit_healthy: 664 · fruit_woodiness: 584?
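A quick way to sanity-check that distribution yourself: a minimal sketch, where `pred_labels` stands in for whatever list of predicted class names your pipeline produces; only the three class names come from the thread, the counts are the ones quoted above.

```python
# Count predicted boxes per class to spot a skewed distribution.
from collections import Counter

def class_distribution(pred_labels):
    # pred_labels: iterable of predicted class-name strings, one per box.
    return Counter(pred_labels)

pred_labels = (["fruit_brownspot"] * 733
               + ["fruit_healthy"] * 664
               + ["fruit_woodiness"] * 584)
print(class_distribution(pred_labels))
# Counter({'fruit_brownspot': 733, 'fruit_healthy': 664, 'fruit_woodiness': 584})
```

Comparing this against the class distribution of the training labels is a cheap first check for a biased model.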

Hate to break up this lovely thread, but guys, your models are overfitting -_-

Use a validation set to validate your predictions.
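For detection, the split has to be grouped by image: all boxes of one image must land on the same side of the split, or validation leaks. A minimal sketch, where the image-ID naming and the 20% ratio are illustrative assumptions:

```python
# Grouped train/validation split for an object-detection dataset.
import random

def split_by_image(image_ids, val_frac=0.2, seed=0):
    # image_ids: one entry per box is fine – duplicates are collapsed,
    # so every box of an image ends up on the same side of the split.
    ids = sorted(set(image_ids))
    random.Random(seed).shuffle(ids)          # deterministic shuffle
    n_val = int(len(ids) * val_frac)
    val = set(ids[:n_val])
    return [i for i in ids if i not in val], sorted(val)

train_ids, val_ids = split_by_image([f"img_{i}" for i in range(100)])
print(len(train_ids), len(val_ids))  # 80 20
```

Scoring your own model on `val_ids` with the same metric as the leaderboard is the only reliable way to tell overfitting apart from a broken metric.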

I don't know why other competitors didn't intervene sooner 😂

5 Sep 2021, 17:22
Upvotes 0

Overfitting can be a problem, yep, but cross-validation on the training set is not very informative... I think there is some problem with the labeling rules; for example, what happens if a fruit belongs to class 1 and class 3 together? I tried some very naive rules on the scores and got a better result than my CV suggested... I don't know, maybe it's my fault, but this competition is very strange...

Whatever reason you give, the organizers would just chime in with "that's how competitions are" or "that's how the data is". Anyway, your goal is to beat the other competitors, which doesn't matter much since everyone is on an even playing field. Just don't expect your stuff to end up in deployment; whoever does use it in real time is making a big mistake.

All right... I don't like this kind of competition very much, but if that's the goal, I will use my hacker's skills 😂... I'm joking...

Just to be clear, I want a test set that is meaningful, one that judges which model is better. It's impossible that an efficientdet-d0/d1/d2 trained on the training set, using classifiers and merging with embeddings, is worse than a basic Faster R-CNN trained for only 1 or 2 epochs, come on! If I show you 100 inferences on the test set from an efficientdet-d2 you will say "wow!", and if I show you the same from the faster-rcnn-fpn you will see mistakes and many problems... but the score is better with faster-rcnn-fpn!!! I don't know... maybe I'm just not good at this 😉

If you have already finished the pre-processing, please reply...