
Classification for Landslide Detection

1 000 CHF
Completed (7 months ago)
Classification
Earth Observation
Python
GIS
Computer Vision
Machine Learning
Deep Learning
993 joined
303 active
Start: Apr 17, 25
Close: Aug 04, 25
Reveal: Aug 04, 25
Huge dissimilarity between train and test set? Great CV score, but poor leaderboard result
Data · 26 Jul 2025, 14:56 · 5

I'm achieving F1-scores above 0.90 on the training set using cross-validation, but my score drops drastically on the public leaderboard after submission.

It makes me strongly suspect a huge dissimilarity between the train and test sets: different distributions, class imbalance, covariate shift, or even label inconsistencies.

Has anyone faced something similar in a competition? How can I detect and mitigate this train-test mismatch?
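A standard way to detect this kind of mismatch is adversarial validation: label train rows 0 and test rows 1, then train a classifier to tell them apart. An AUC near 0.5 means the sets look alike; an AUC near 1.0 means they are easily separable, i.e. real covariate shift. A minimal sketch with synthetic stand-in features (the shifted test distribution here is made up for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X_train = rng.normal(0.0, 1.0, size=(500, 8))  # stand-in for train features
X_test = rng.normal(0.8, 1.0, size=(500, 8))   # stand-in test set with a mean shift

# Stack both sets and label the origin of each row: 0 = train, 1 = test.
X_all = np.vstack([X_train, X_test])
y_all = np.r_[np.zeros(len(X_train)), np.ones(len(X_test))]

# Cross-validated AUC of the "which set is this row from?" classifier.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
auc = cross_val_score(clf, X_all, y_all, cv=5, scoring="roc_auc").mean()
print(f"adversarial AUC: {auc:.3f}")
```

If the AUC is high, the classifier's feature importances also point at which features drift most between train and test.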

PS: I’m using stratified K-fold CV, and I’ve tried techniques like SMOTE, class weighting, per-fold threshold tuning, etc. Nothing seems to bridge the gap.
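For reference, per-fold threshold tuning as mentioned above usually looks like this: sweep the decision threshold on each validation fold, keep the F1-maximizing one, and average across folds for submission. The data below is synthetic; plug in your own features and labels.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold

# Synthetic imbalanced data (80/20) standing in for the real features.
X, y = make_classification(n_samples=1000, weights=[0.8, 0.2], random_state=0)

thresholds = []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for tr_idx, va_idx in cv.split(X, y):
    model = LogisticRegression(max_iter=1000).fit(X[tr_idx], y[tr_idx])
    proba = model.predict_proba(X[va_idx])[:, 1]
    # Pick the threshold that maximizes F1 on this fold's validation split.
    grid = np.linspace(0.05, 0.95, 19)
    best_t = max(grid, key=lambda t: f1_score(y[va_idx], proba >= t, zero_division=0))
    thresholds.append(best_t)

final_threshold = float(np.mean(thresholds))
print(f"per-fold thresholds: {thresholds}, averaged: {final_threshold:.2f}")
```

Note that if the test class ratio differs from training, a threshold tuned on CV folds will still be systematically off, which is one way a large CV-LB gap appears.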

Honestly, I feel like the test set comes from another world 😅

Discussion · 5 answers
crossentropy
Federal University of Technology, Akure

What's your LB score?

26 Jul 2025, 14:59
Upvotes 0
crossentropy
Federal University of Technology, Akure

Have you tried ensembling separate models?

CodeJoe

No, I think LB and CV correlate strongly here; the difference might just be a few decimal points. If you're using a computer vision approach, you might be overfitting.

26 Jul 2025, 20:46
Upvotes 0

My CV and leaderboard scores also correlate pretty well.

However, one thing I am almost sure about is that the class imbalance is even stronger in the test set. If you assume the class ratio is the same as in training, you may get misleading results.
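If the positive class really is rarer in the test set, one option besides retraining is to re-weight the model's probabilities for the new prior (a standard Bayes prior-shift correction). The priors below (20% in train, 5% assumed in test) are made-up numbers for illustration:

```python
import numpy as np

def adjust_prior(p, train_prior, test_prior):
    """Re-weight p(y=1|x) from the training class prior to an assumed test prior."""
    num = p * (test_prior / train_prior)
    den = num + (1.0 - p) * ((1.0 - test_prior) / (1.0 - train_prior))
    return num / den

# Probabilities from a model trained at a 20% positive rate, corrected
# for an assumed 5% positive rate in the test set.
p = np.array([0.3, 0.5, 0.9])
print(adjust_prior(p, train_prior=0.20, test_prior=0.05))
```

Since the assumed test prior is lower than the training prior, every corrected probability comes out smaller, which effectively raises the decision threshold for the positive class.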

27 Jul 2025, 10:45
Upvotes 1