First I did some data pre-processing and feature engineering.
Then I trained an RF model on a subset of my data to compute feature importances, and filtered out all the non-important features.
In the end, I trained XGB on my pre-processed data and got a train score of 0.17231 and a test score of 0.17228.
But after submission I got a score of 1.383, which is very different from my local train and test scores.
Did I miss something?
What was your approach (pre-proc, algorithm, etc)?
To get a CV score that corresponds to the LB, use the two seasons as validation folds, i.e. train your model on season 1 and validate it on season 2, and vice versa.
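In case it helps, here is a minimal sketch of that season-based validation. The `season_folds` helper and the toy `season` array are made up for illustration; in practice the labels would come from whatever column marks the season in your data.

```python
import numpy as np

def season_folds(seasons):
    """Yield (held_out_season, train_idx, val_idx), holding out one season at a time."""
    seasons = np.asarray(seasons)
    for held_out in np.unique(seasons):
        val_idx = np.flatnonzero(seasons == held_out)    # rows of the held-out season
        train_idx = np.flatnonzero(seasons != held_out)  # rows of every other season
        yield held_out, train_idx, val_idx

# Toy data (assumption): one season label per row, 1 or 2
season = np.array([1, 1, 1, 2, 2, 2, 2])

folds = list(season_folds(season))
for held_out, train_idx, val_idx in folds:
    print(f"validate on season {held_out}: train rows {train_idx}, val rows {val_idx}")
```

Fit the model on `train_idx` and score it on `val_idx` for each fold, then average the two scores. No row from the validation season is ever seen during training.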
I haven't looked into the competition much yet, but this is probably at least partly a time series problem. If that's the case, a standard random train-test split will give misleading local scores, because the model gets validated on rows from the same period it was trained on. J0NNY gave a good solution to the problem in the comment above.
There's also an implementation of time series splitting in sklearn that might be useful: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.TimeSeriesSplit.html. Anyway, if you google "time series train test split", you can find various explanations of how to deal with this kind of problem.
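A quick sketch of how `TimeSeriesSplit` behaves, assuming the rows are already sorted by time. The array `X` here is toy data, not the competition data.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# Toy data (assumption): 12 rows, already in chronological order
X = np.arange(12).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=3)
splits = list(tscv.split(X))

for train_idx, test_idx in splits:
    # Every training row comes before every test row, so nothing from the future leaks in
    print("train:", train_idx, "test:", test_idx)
```

Each successive fold trains on a longer prefix of the data and tests on the block that follows it, which is closer to how the model is scored on unseen future data.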