CGIAR Eyes on the Ground Challenge

Helping Africa · $10 000 USD · Prediction
Completed (over 2 years ago)
871 joined · 137 active

Start: Jul 21, 23
Close: Nov 03, 23
Reveal: Nov 03, 23
Classification vs Regression
Data · 4 Oct 2023, 07:39 · 13

The description of the target value `extent` says it lies in the range 0-100, in 10% increments.

> **Ancillary data:** For each growth stage, the damage types and their `extent` are provided, with the extent given as a percentage (%) in 10% increments.

Therefore, from a classification point of view, it seems we have 11 classes (0, 10, 20, ..., 100). It might appear easier for a model to tackle the problem this way, since we are dealing with only eleven discrete values instead of a continuous range (regression), even though the labels are ordered.

However, from what I have tried so far, a simple regression model works far better than a simple classification one. I would assume this is because of the imbalanced nature of the dataset, or because the test data contains values outside the train extent range, like in this example given to us:

| ID | extent |
| --- | --- |
| L1095F00009C01S00200Rp01978 | 56 |
| L1095F00009C01S00200Rp09218 | 48 |

What do you think?

Discussion · 13 answers

Because MSE is sensitive to the distance between numeric values, while cross-entropy is not.

4 Oct 2023, 09:14
Upvotes 1
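A tiny numeric sketch of that point, using hypothetical one-hot predictions (true extent 40 is an assumption for illustration): squared error grows with how far the predicted value is from the truth, while cross-entropy only looks at the probability assigned to the true class.

```python
import numpy as np

# The 11 possible extent classes: 0, 10, ..., 100.
labels = np.arange(0, 101, 10)
true_idx = labels.tolist().index(40)  # suppose the true extent is 40

# Two confident-but-wrong classifiers: one predicts 50, one predicts 100.
close_pred = np.eye(len(labels))[5]   # all probability mass on 50
far_pred = np.eye(len(labels))[10]    # all probability mass on 100

# Squared error on the decoded value grows with distance from the truth...
mse_close = (labels[5] - labels[true_idx]) ** 2    # (50 - 40)^2 = 100
mse_far = (labels[10] - labels[true_idx]) ** 2     # (100 - 40)^2 = 3600

# ...while cross-entropy only sees the probability on the true class,
# so both wrong predictions receive the same (clipped) loss.
eps = 1e-12
ce_close = -np.log(close_pred[true_idx] + eps)
ce_far = -np.log(far_pred[true_idx] + eps)
```

So on an RMSE leaderboard, a classifier that confuses 40 with 100 is punished far more than one that confuses 40 with 50, even though cross-entropy training treats both mistakes identically.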
Muhamed_Tuo
Inveniam

Hey, it makes sense that the better choice here would be regression. Take a simple example where your model is struggling over whether an image's extent should be 40 or 50, with both having equal probability. How do you decide which to pick? An obvious solution is to take the middle ground (45). Well, that's done natively with a regression approach.

Someone could go with a classification approach and then use the probabilities to output a single value, meaning `np.sum(probabilities * labels)`, with labels being `[0, 10, ..., 100]`.

Haven't tried the latter, but it could be a good compromise.

4 Oct 2023, 16:36
Upvotes 0
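The expected-value decoding suggested above can be sketched in a few lines (the function name `decode_extent` and the 50/50 example are illustrative, not from the thread):

```python
import numpy as np

labels = np.arange(0, 101, 10)  # the 11 extent classes [0, 10, ..., 100]

def decode_extent(probabilities):
    """Collapse class probabilities into a single extent value via the
    expected value: np.sum(probabilities * labels)."""
    return float(np.sum(np.asarray(probabilities, dtype=float) * labels))

# A model torn 50/50 between extents 40 and 50 lands on the middle ground.
probs = np.zeros(len(labels))
probs[4] = 0.5  # P(extent = 40)
probs[5] = 0.5  # P(extent = 50)
print(decode_extent(probs))  # 45.0
```

This gives a classification model the same "hedging" behaviour regression has for free, at the cost of predictions that no longer snap to the 10% grid.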

Hi, thanks for your input on this. Looking at the `extent` column in the training dataset, it only contains percentages in 10% increments, so every instance has one specific label (one of `[0, 10, ..., 100]`).

Therefore, regarding the model struggling between 40 and 50, may I ask in which cases you think this phenomenon might happen?

Muhamed_Tuo
Inveniam

Yeah, I agree that every instance has one specific label. But the metric being RMSE (not log loss or any other classification metric), together with the gradual 10% increments, makes it even more punishing when you predict the wrong extent.

I have seen a few instances where the image contains obvious drought damage, but the extent is 0. In such cases, predicting anything higher than 0 results in a relatively high penalty.

I also saw a few cases (a lot, actually) where the extent is very low (say 30) but the model's estimate of the damage is about 70 (and frankly, in some of those cases I believed the model to be right, for the simple reason that there wasn't a single healthy plant in those images).
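To see how steep that penalty is under the competition metric, here is a small RMSE sketch on a hypothetical batch (the four zero-labelled images and the single 70 prediction are invented for illustration):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error, the competition metric."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Hypothetical batch of four images all labelled extent = 0.
# A model that "trusts its eyes" and predicts visible damage (70)
# on just one of them pays a steep price on this metric.
y_true = [0, 0, 0, 0]
careful = [0, 0, 0, 0]
fooled = [0, 0, 0, 70]
print(rmse(y_true, careful), rmse(y_true, fooled))  # 0.0 35.0
```

One confident mistake on a zero-extent image can dominate the score for a whole batch, which is why these label/image mismatches hurt so much.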

Oh, interesting! You are correct.

Nayal_17

Are you using all train images for training, or leaving some out on the basis of analysis?

Muhamed_Tuo
Inveniam

I'm using all images for now, but removing some images *might* help.

Nayal_17

You got an RMSE score of 10 without any post-processing using the metadata given in the dataset?

hasan_n

I think all submissions with scores >= 9 can be achieved without any use of the "damage type" column. I doubt that the higher-scoring solutions didn't use the leak, though.

Take into consideration that you can use the leak to get a high score while hiding your real score.

Nayal_17

Hmm, I too got an RMSE of 10 without damage type, and will most probably get a score of 9 too. But I don't think there is much scope for pushing it further. It's a humble request to all top-10 participants to at least tell us whether they used damage type in their solution in any way.

hasan_n

Well, the final accepted top-10 solutions are based on scores on the full test set (currently the score is computed on only 20-30% of the test set, so a shake-up is expected).

If all these top-10 solutions select submissions that use the leak, they will all be disqualified. Given the time everyone invested in this competition, I prefer not to believe that they would risk it all for nothing.

Anyway, I think it's better if the organizers check the top 25-30 solutions instead of the top 10.

Nayal_17

Agree with you on every point, but I am not expecting much of a shake-up, as CV and LB seem to be correlated and the test set seems to be randomly sampled from the full dataset.

Muhamed_Tuo
Inveniam

@Nayal_17 Yeah, my current score is without any post-processing. As @hasan_n says, you can achieve a score of 9.x without any post-processing or use of the leak.

I wouldn't trust any score lower than that :)