Hey guys, congrats on completing this hackathon. I was just wondering what the best rank was that a team or individual achieved using an ensemble of only machine learning models.
Hi Roman, As for me, I tried interpolating and extrapolating all the time series, as this was the only logical thing to do with this data, but it did not give the expected results, so I got my best score without imputing missing values. As for feature engineering, my best single-model score on the private LB is 33.11 with 180 features, and it is ensembled with a model using 3,115 features.
@Roman_Lents, for most of it we didn't try to do anything with the NaN values and let the classifier handle them. Manually imputing them increased the RMSE for our models.
My best ensemble: an LGBM with local CV 21.6 and a PyTorch model with local score 23.43. Rank 15 on the private leaderboard, rank 30+ on the public leaderboard. Congratulations to all participants and winners!
Hello Krishna_Priya,
Congratulations on your hard work and performance! Our algorithm is an LGBM + CatBoost ensemble.
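For anyone curious, an LGBM + CatBoost ensemble like this usually just means blending the two models' predictions. Here is a minimal sketch of that blending step using NumPy only; the prediction arrays and the weight are made up for illustration (in practice they would come from the trained models' `predict` calls, and the weight would be tuned on a validation set):

```python
import numpy as np

# Hypothetical held-out predictions from the two models; in practice these
# would come from lgbm_model.predict(X_val) and catboost_model.predict(X_val).
lgbm_preds = np.array([10.2, 33.5, 21.0, 18.7])
catboost_preds = np.array([11.0, 31.9, 22.4, 19.1])

# A simple blend: weighted average of the two models' predictions.
w = 0.6  # weight on the LGBM model (an assumption, not from this thread)
ensemble_preds = w * lgbm_preds + (1 - w) * catboost_preds

print(ensemble_preds)  # e.g. [10.52 32.86 21.56 18.86]
```

A weighted average often beats either model alone when their errors are only partially correlated; the weight can be chosen by grid search on the local CV score.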
Congrats, Nelly. Thank you for the information :)
Congrats! What about the feature engineering part and imputing NaN values?
I would love to see this too
CatBoost and 2 LGBMs - Rank 5
Thank you for telling us, youngtard. This means most of us got our scores with ML models only, so features and validation were the key. Cool.
Bernd, that's the best CV I have heard of so far. I assume you used KFold; how many folds did you use in that case? By the way, great work.
Thank you! Normally I use 5-fold CV, but in this case I used 20 folds (computation was fast because of the small dataset).
Thanks, that explains the reason for such a CV :D
Same ensemble, and we got rank 7.