Sea Turtle Rescue: Error Detection Challenge
Cash and prizes worth $1,950 USD
Help Local Ocean Conservation clean their sea turtle rescues database
30 November 2018–28 April 2019 23:59
302 data scientists enrolled, 53 on the leaderboard
Next Topic: Cross-Validation
published 13 Mar 2019, 10:05

Has anyone experienced a huge discrepancy between their local cross-validation score and their leaderboard score?

I'm doing 5-fold cross-validation. Here's what I've been getting in terms of CV vs LB scores: 0.029 -> 0.081

0.027 -> 0.067

0.049 -> 0.058

I think you mean 0.019, not 0.049?

The short message is that it's hard to trust your local CV, since Zindi competitions always use a 50/50 split between the public and private leaderboards. If only, say, 30% of the test set were on the public leaderboard, then you would trust your local CV much more. Take this message for competition purposes only. Thanks.

No typo there: 0.049-something. Correctly building the target variable is (though IMO it shouldn't be) the first step in solving the problem.

Finding a proper cross-validation process is the next part of the problem. Hint: K-fold cross-validation might not be the best fit here due to the temporal dimension in the dataset.
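To make the hint concrete, here is a minimal sketch of a time-aware split, where each validation fold only contains rows that are not earlier than the rows used for training. It is only an illustration, not necessarily the scheme that works best for this competition: the function name `expanding_window_splits` and the example dates are made up, and the assumption is simply that every row carries a date. scikit-learn's `TimeSeriesSplit` offers a similar expanding-window scheme if you prefer a library implementation.

```python
from datetime import date


def expanding_window_splits(dates, n_splits=3):
    """Yield (train_idx, valid_idx) pairs of row indices.

    Rows are ordered by date, so validation rows are never earlier
    than any training row in the same split (ties on the same date
    may land on either side of the boundary).
    """
    # Indices of the rows sorted from oldest to newest.
    order = sorted(range(len(dates)), key=lambda i: dates[i])
    fold_size = len(order) // (n_splits + 1)
    if fold_size == 0:
        raise ValueError("not enough rows for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_idx = order[: k * fold_size]
        # The last validation fold absorbs any leftover rows.
        end = (k + 1) * fold_size if k < n_splits else len(order)
        valid_idx = order[k * fold_size : end]
        yield train_idx, valid_idx


# Hypothetical, deliberately unsorted dates (not taken from the competition data).
example_dates = [
    date(2016, 5, 1), date(2016, 1, 1), date(2016, 8, 1), date(2016, 3, 1),
    date(2016, 2, 1), date(2016, 7, 1), date(2016, 4, 1), date(2016, 6, 1),
]

splits = list(expanding_window_splits(example_dates, n_splits=3))
for train_idx, valid_idx in splits:
    print("train up to", max(example_dates[i] for i in train_idx),
          "| validate from", min(example_dates[i] for i in valid_idx))
```

The training window grows with each split, which mimics the situation of predicting newer records from older ones. A plain shuffled K-fold would instead mix future and past rows in the same training set.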

One thing I forgot: there might be no correlation between the CV and LB scores, which would make it difficult to build a model that generalises well.
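One rough way to check this is to log a (CV, LB) pair for every submission and compute their correlation. The sketch below hand-rolls a Pearson coefficient (the helper name `pearson` is made up) and feeds it the three pairs quoted in this thread. That is an assumption for illustration only: those pairs come from different posters and different models, and three points are far too few to conclude anything, so in practice you would use your own submission history.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# (CV, LB) pairs quoted in this thread; illustrative only.
cv_scores = [0.029, 0.027, 0.049]
lb_scores = [0.081, 0.067, 0.058]

r = pearson(cv_scores, lb_scores)
print(f"CV vs LB correlation: {r:.3f}")
```

A coefficient close to +1 would suggest that improvements in local CV tend to carry over to the leaderboard. A value near zero or below would suggest the validation scheme is not tracking the leaderboard at all.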