My current ranking submission comes from a voting strategy based on CatBoost and LightGBM. Nothing special in the preprocessing step apart from fixing encoding issues on some features, avoiding multicollinearity, deleting features with very low variance, filling missing values, and feature selection using random forest feature importances.
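For anyone curious what a voting step can look like, here is a minimal sketch in plain Python. It assumes soft voting (averaging predicted probabilities), which the post does not specify. The function name `soft_vote`, the 0.5 threshold, and the probabilities are all made up for illustration and are not the actual competition outputs.

```python
def soft_vote(model_probs, weights=None):
    """Blend per-sample positive-class probabilities from several models.

    model_probs: list of lists, one list of probabilities per model.
    weights: optional per-model weights (assumed equal if omitted).
    """
    if weights is None:
        weights = [1.0] * len(model_probs)
    total = sum(weights)
    n_samples = len(model_probs[0])
    blended = []
    for i in range(n_samples):
        # Weighted average of every model's probability for sample i.
        score = sum(w * probs[i] for w, probs in zip(weights, model_probs))
        blended.append(score / total)
    return blended


# Made-up probabilities standing in for CatBoost and LightGBM predictions.
catboost_probs = [0.9, 0.4, 0.2]
lightgbm_probs = [0.7, 0.5, 0.1]

blended = soft_vote([catboost_probs, lightgbm_probs])
labels = [int(p >= 0.5) for p in blended]  # assumed decision threshold
print(blended, labels)
```

Passing `weights=[3, 1]` would lean the blend towards the first model. Hard voting (majority of predicted labels) is the other common choice.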
Funnily, my best-scoring model on the private leaderboard (0.9265) is a simple random forest. But I didn't consider it, as it was giving 0.91218 on the public leaderboard.
Like many others here, I think, I was torn between trusting my cross-validation and increasing my score on Zindi. ;) I guess the first option was the right one, haha. That's a good lesson for us.
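To make the cross-validation point concrete, here is a rough sketch of k-fold scoring in plain Python. Everything in it is an assumption for illustration: the helper names `kfold_indices` and `cv_scores`, the toy labels, and the majority-class "model" are stand-ins, not the setup used in the competition. The idea is to look at the mean and spread over several folds rather than a single public split.

```python
import random
from statistics import mean, stdev


def kfold_indices(n_samples, k, seed=0):
    """Yield (train_idx, valid_idx) pairs for k shuffled folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        valid = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, valid


def cv_scores(n_samples, k, score_fn):
    """Score every fold with score_fn(train_idx, valid_idx)."""
    return [score_fn(tr, va) for tr, va in kfold_indices(n_samples, k)]


# Toy labels and a majority-class baseline standing in for a real model.
toy_labels = [0, 1, 1, 0, 1, 1, 1, 0, 1, 1]


def majority_accuracy(train, valid):
    # Predict the most common training label for every validation sample.
    majority = int(2 * sum(toy_labels[i] for i in train) >= len(train))
    return sum(toy_labels[i] == majority for i in valid) / len(valid)


scores = cv_scores(len(toy_labels), k=5, score_fn=majority_accuracy)
print(f"CV mean={mean(scores):.3f}, std={stdev(scores):.3f}")
```

A large spread across folds is a hint that a single leaderboard number may be just as noisy.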
Yes!!! Top 10
Congratulations to the winners.
And don't hesitate to let us know which model performed well and which you scored best with.
Congratulations to the winners.