Hi everyone, in the little time I've spent playing around with the data and making submissions, I've noticed that the scoring function in the starter notebook gives better LB results when the total distance score is roughly between 80 and a little over 100 (at least for me). There have been times I scored a total distance of less than 20 with the function, yet the LB score was disappointing compared with submissions where the function reported a higher total distance. I'd expect the opposite (a lower local score should mean a better LB score). So regardless of the total distance you get locally, I think you should make a submission, since a lower scored distance does not guarantee a higher LB score, and vice versa.
(If you have similar or different views feel free to comment.)
I have also experienced this. I scored 32 with the scoring function but got a much worse result on the leaderboard.
Are you using clustering?
The main difference is the period being scored (and the number of crashes, which affects total distance). So the local scoring function is useful for comparing your local models against each other, but you're right that it won't necessarily match the leaderboard.
It would seem to me that you are exploiting patterns in the local data that are not present in the full population, i.e. overfitting. Have you divided the local data into train/validation/test sets? The idea is that you only ever look at the train set while developing, and touch the test set once at the end, solely to confirm or contradict that the model you chose based on the train set is actually a good model for unseen data from the population.
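For anyone who wants to try this, here's a minimal sketch of that split in Python. The function name `split_data` and the fractions are just illustrative choices, not anything from the starter notebook; swap in your actual samples and local scoring function.

```python
import random

def split_data(samples, seed=0, train_frac=0.6, val_frac=0.2):
    """Shuffle and split samples into disjoint train/validation/test subsets.

    The remaining (1 - train_frac - val_frac) fraction becomes the test set,
    which you should evaluate only once, at the very end.
    """
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

# Example with 100 dummy samples: 60 / 20 / 20 split
train, val, test = split_data(list(range(100)))
print(len(train), len(val), len(test))  # 60 20 20
```

Develop and tune against `train` (checking `val` for model selection), and keep `test` untouched until you want one honest estimate of how the model generalizes.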