Very elegant solution. The feature engineering is really good.
Thank you!
Very nice work; I must also underline the elegance of the approach. I am speechless. Personally, I had a lot of trouble with the LGBM model during the competition, and I am pleasantly surprised by the approach.
I like LightGBM; for me it is my workhorse. It is very fast compared to CatBoost and XGBoost. In this competition I noticed that omitting features improves the CV score. Since LightGBM runs very quickly, I looped over all features and ran the CV without each feature in turn. It also makes parameter tuning simpler.
Have you tried any other feature combinations besides branch_occupation_code?
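For anyone who wants to try the leave-one-feature-out loop described above, here is a minimal sketch. It is not the original code from the solution: the function names `leave_one_feature_out` and `make_lgbm_scorer` are made up for illustration, and the sketch assumes a metric where higher is better and a classification target (the default `stratified=True` in `lightgbm.cv`).

```python
from typing import Callable, Dict, List, Sequence


def leave_one_feature_out(
    features: Sequence[str],
    cv_score: Callable[[List[str]], float],
) -> Dict[str, float]:
    """For each feature, return the change in CV score when it is dropped.

    A positive value means the score went up without that feature.
    """
    baseline = cv_score(list(features))
    deltas = {}
    for feat in features:
        kept = [f for f in features if f != feat]
        deltas[feat] = cv_score(kept) - baseline
    return deltas


def make_lgbm_scorer(X, y, params, nfold=5, num_boost_round=200):
    """Hypothetical helper: build a cv_score callable backed by lightgbm.cv.

    X is assumed to be a pandas DataFrame and y the target column.
    """
    import lightgbm as lgb  # imported here so the loop above works without it

    def cv_score(cols):
        result = lgb.cv(
            params,
            lgb.Dataset(X[cols], label=y),
            num_boost_round=num_boost_round,
            nfold=nfold,
            seed=0,
        )
        # lgb.cv returns a dict of per-round lists; use the first "...-mean"
        # entry and take its value at the last boosting round.
        mean_key = next(k for k in result if k.endswith("-mean"))
        return result[mean_key][-1]

    return cv_score
```

Usage would look like `deltas = leave_one_feature_out(list(X.columns), make_lgbm_scorer(X, y, params))`; features with a clearly positive delta are candidates for removal. For a loss metric (lower is better) the sign has to be read the other way round.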
Yes, I tried a lot of feature combinations, but most of them worsened the CV score. The 2-way interactions of all products improved the CV score, but the 3-way interactions did not bring any improvement.
For example, these features also did not improve my CV score:
train['tmember']-train['tuntilmember']
train['tmember']/train['tuntilmember']
train['age']/train['tuntilmember']
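The 2-way interactions mentioned above could be built roughly like this. This is only a sketch under my own assumptions: I read "interactions" as the element-wise product of each pair of product columns, and the column names `prod_a`, `prod_b`, `prod_c` and the helper `add_pairwise_interactions` are hypothetical, not taken from the competition data.

```python
from itertools import combinations
from typing import Sequence

import pandas as pd


def add_pairwise_interactions(df: pd.DataFrame, cols: Sequence[str]) -> pd.DataFrame:
    """Return a copy of df with one product column per unordered pair in cols."""
    out = df.copy()
    for a, b in combinations(cols, 2):
        out[f"{a}_x_{b}"] = out[a] * out[b]
    return out


# Toy frame with made-up binary product flags
train = pd.DataFrame(
    {"prod_a": [1, 0, 1], "prod_b": [1, 1, 0], "prod_c": [0, 1, 1]}
)
train_ix = add_pairwise_interactions(train, ["prod_a", "prod_b", "prod_c"])
```

Changing `combinations(cols, 2)` to `combinations(cols, 3)` (and multiplying three columns) would give the 3-way version, which did not help in this case.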
Hey, how do you come up with new features? This is the part I do not understand.
The easiest way with numerical features is always to multiply or subtract two or more features. If you have domain knowledge, you can do that in a more structured way. If you want to learn more, check this site (as a starting point): https://www.kdnuggets.com/tag/feature-engineering
Thank you for sharing.
Congrats and thanks for sharing. I somehow managed to finish the competition using XGBoost. I was able to submit only one solution, but I learnt a lot. I was not aware of the LightGBM model. Thanks again for sharing.
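As a small illustration of combining two numerical features, here is a sketch with pandas. The helper name `add_arithmetic_features` and the toy values are made up; the column names `tmember` and `tuntilmember` are borrowed from the examples earlier in this thread. Replacing zeros with NaN before dividing is my own choice to avoid infinite values, not something the solution is known to do.

```python
import numpy as np
import pandas as pd


def add_arithmetic_features(df: pd.DataFrame, a: str, b: str) -> pd.DataFrame:
    """Return a copy of df with the product, difference and ratio of columns a and b."""
    out = df.copy()
    out[f"{a}_times_{b}"] = out[a] * out[b]
    out[f"{a}_minus_{b}"] = out[a] - out[b]
    # Turn zero denominators into NaN so the ratio is NaN instead of inf
    denom = out[b].replace(0, np.nan)
    out[f"{a}_div_{b}"] = out[a] / denom
    return out


# Toy values, for illustration only
train = pd.DataFrame({"tmember": [10, 4, 6], "tuntilmember": [5, 0, 3]})
train_fe = add_arithmetic_features(train, "tmember", "tuntilmember")
```

Whether any such feature helps has to be checked against the CV score; as noted in this thread, most combinations made it worse.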
Really good solution, especially interaction features. Thanks for sharing and congratulations.