Different Locations in Train and Test Sets

Data · 16 Jun 2025, 21:04 · 7

Locations in the test set fall outside the range of the training locations, which might cause models that rely heavily on lat/lon features or Earth observation data to overfit to the training set.
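A quick way to quantify this is to count how many test points lie outside the bounding box of the training coordinates. The sketch below is only an illustration: the function names and the toy coordinates are hypothetical, not taken from the competition data, and a bounding box is a rough stand-in for "range of training locations".

```python
def bounding_box(points):
    """Return (min_lat, max_lat, min_lon, max_lon) for (lat, lon) pairs."""
    lats = [p[0] for p in points]
    lons = [p[1] for p in points]
    return min(lats), max(lats), min(lons), max(lons)


def fraction_outside(train_points, test_points):
    """Share of test points that fall outside the train bounding box."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(train_points)
    outside = [
        not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon)
        for lat, lon in test_points
    ]
    return sum(outside) / len(outside)


# Hypothetical toy coordinates, for illustration only.
train = [(0.0, 30.0), (1.0, 31.0), (2.0, 32.0)]
test = [(1.5, 30.5), (5.0, 35.0)]
print(fraction_outside(train, test))  # 0.5: one of the two test points is outside
```

A high fraction suggests the model will have to extrapolate in space, so raw lat/lon features deserve extra scrutiny.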

Update: In my tests of various models, location-dependent approaches consistently outperformed the others in both CV and LB scores. The best-performing model actively uses location data without overfitting. Still, this difference between the train and test data is worth keeping in mind.
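One way to keep the difference in mind while still using location features is to validate on held-out areas rather than on random rows, so that CV also has to predict at unseen locations. Below is a minimal sketch of such a spatially blocked split. It assumes points are (lat, lon) pairs; `spatial_block_folds`, `cell_size_deg` and `n_folds` are hypothetical names chosen for this example, and the grid size would need tuning to the actual data.

```python
import math
from collections import defaultdict


def spatial_block_folds(points, cell_size_deg=0.5, n_folds=5):
    """Yield (train_idx, valid_idx) with no grid cell shared between the two."""
    # Group row indices by the grid cell their coordinates fall into.
    cells = defaultdict(list)
    for i, (lat, lon) in enumerate(points):
        key = (math.floor(lat / cell_size_deg), math.floor(lon / cell_size_deg))
        cells[key].append(i)

    # Greedy balancing: place each cell, largest first, into the smallest fold.
    folds = [[] for _ in range(n_folds)]
    for block in sorted(cells.values(), key=len, reverse=True):
        min(folds, key=len).extend(block)

    for k in range(n_folds):
        valid_idx = sorted(folds[k])
        train_idx = sorted(i for j, f in enumerate(folds) if j != k for i in f)
        yield train_idx, valid_idx


# Hypothetical toy grid of points, for illustration only.
points = [(0.1 * i, 30 + 0.1 * j) for i in range(10) for j in range(10)]
for train_idx, valid_idx in spatial_block_folds(points, n_folds=4):
    print(len(train_idx), len(valid_idx))
```

If there are fewer occupied cells than folds, some folds come out empty, so the cell size and fold count should be chosen together. Libraries such as scikit-learn offer group-aware splitters that could serve the same purpose once each row has a cell or cluster label.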

Discussion 7 answers

emnos

This is intentional and expected. You will very rarely have a competition with a random sample. In addition, the 30/70 Public/Private hold-out split is not random either, which is why in most cases you will see a significant shake-up.

16 Jun 2025, 23:09

Upvotes 0

omerym10

Interesting! In this case, how do you avoid a shake-up?

replied to emnos · 17 Jun 2025, 10:42

Upvotes 0