First I did some data pre-processing and feature engineering.
Then I trained an RF model on a subset of my data to compute feature importances, and filtered out the unimportant features.
Finally, I trained XGB on the pre-processed data and got a train score of 0.17231 and a test score of 0.17228.
But after submission I got a score of 1.383, which is very different from my local train and test scores.
Did I miss something?
What was your approach (pre-proc, algorithm, etc)?
Thanks
To get a CV score that corresponds to the LB, use the two seasons as validation folds, i.e. train your model on season 1 and validate it on season 2, and vice versa.
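
Here is a rough sketch of that split, in case it helps. It assumes each row carries a season label; the names `season_folds` and `season_cv`, the toy mean predictor, and the mean-absolute-error metric are all placeholders I made up for illustration, so swap in your own XGB fit/predict and the competition metric.

```python
from statistics import mean


def season_folds(seasons):
    """Yield (held_out_season, train_idx, valid_idx), holding out each season once."""
    for held_out in sorted(set(seasons)):
        train_idx = [i for i, s in enumerate(seasons) if s != held_out]
        valid_idx = [i for i, s in enumerate(seasons) if s == held_out]
        yield held_out, train_idx, valid_idx


def season_cv(X, y, seasons, fit, predict, metric):
    """Train on all other seasons, score on the held-out one; return a score per season."""
    scores = {}
    for held_out, tr, va in season_folds(seasons):
        model = fit([X[i] for i in tr], [y[i] for i in tr])
        preds = predict(model, [X[i] for i in va])
        scores[held_out] = metric([y[i] for i in va], preds)
    return scores


# Toy data (made up): six rows, three per season.
X = [[0], [0], [0], [0], [0], [0]]
y = [1, 2, 3, 4, 5, 6]
seasons = [1, 1, 1, 2, 2, 2]

# Placeholder model: always predicts the mean of the training targets.
fit = lambda X_tr, y_tr: mean(y_tr)
predict = lambda model, X_va: [model] * len(X_va)
# Placeholder metric: mean absolute error.
mae = lambda y_true, y_pred: mean(abs(a - b) for a, b in zip(y_true, y_pred))

scores = season_cv(X, y, seasons, fit, predict, mae)
print(scores)
```

The point is that no row from the validation season is ever seen during training, which a random row-level split does not guarantee. If the per-season scores land much closer to the LB than your random-split score did, the earlier split was probably leaking information across seasons.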