Kudos to everyone who participated in this challenge as it was a difficult one.
My approach is quite straightforward: create some aggregate features from the policy data, plus some flags. The major improvement came from training only on policies present in the client data and dropping policies that lapsed in 2017 and 2018.
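To make the filtering step concrete, here is a minimal Python sketch (the linked solution itself is in R). The field names `policy_id` and `lapse_year`, and the sample rows, are assumptions for illustration and are not taken from the competition files.

```python
# Hypothetical rows; field names are assumptions, not the competition schema.
policies = [
    {"policy_id": "P1", "lapse_year": None},
    {"policy_id": "P2", "lapse_year": 2017},
    {"policy_id": "P3", "lapse_year": 2019},
    {"policy_id": "P4", "lapse_year": None},
    {"policy_id": "P5", "lapse_year": 2018},
]
# Policies that appear in the (hypothetical) client data.
client_policy_ids = {"P1", "P2", "P3", "P5"}

DROPPED_LAPSE_YEARS = {2017, 2018}


def select_training_rows(rows, client_ids, dropped_years=DROPPED_LAPSE_YEARS):
    """Keep policies found in the client data that did not lapse in a dropped year."""
    return [
        r for r in rows
        if r["policy_id"] in client_ids and r["lapse_year"] not in dropped_years
    ]


train_rows = select_training_rows(policies, client_policy_ids)
train_ids = [r["policy_id"] for r in train_rows]
print(train_ids)
```

Here P2 and P5 are dropped for lapsing in 2017 and 2018, and P4 is dropped because it is missing from the client data.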
##### FEATURES
Unique counts: policies, products, principals, family per policy, count of unique mean_premium, frequency encoding of location
Mean NPR PREMIUM, minimum NPR PREMIUM per policy, location, agent, type
Flags: flag if policy was effected in 2020 else 0, flag if policy id is present in client data else 0, flag if policy has premium data from 2019 else 0
And finally, umap dimension reduction features using tsne (2 dimensions)
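A rough Python sketch of how per-policy aggregates, a frequency encoding, and the three flags could be computed. This is an illustration only: every field name (`product`, `location`, `premium`, `effective_year`, `premium_year`) and every sample row is an assumption, and only a subset of the features listed above is shown.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical policy rows; all field names are assumptions for illustration.
rows = [
    {"policy_id": "P1", "product": "A", "location": "L1",
     "premium": 100.0, "effective_year": 2020, "premium_year": 2019},
    {"policy_id": "P1", "product": "B", "location": "L1",
     "premium": 300.0, "effective_year": 2020, "premium_year": 2020},
    {"policy_id": "P2", "product": "A", "location": "L2",
     "premium": 50.0, "effective_year": 2018, "premium_year": 2020},
]
client_policy_ids = {"P1"}  # hypothetical set of ids found in the client data


def build_features(rows, client_ids):
    """Return one feature dict per policy id."""
    by_policy = defaultdict(list)
    for r in rows:
        by_policy[r["policy_id"]].append(r)

    # Frequency encoding: share of all rows that fall in each location.
    loc_counts = Counter(r["location"] for r in rows)
    total = len(rows)

    feats = {}
    for pid, grp in by_policy.items():
        premiums = [r["premium"] for r in grp]
        feats[pid] = {
            "n_products": len({r["product"] for r in grp}),
            "mean_premium": mean(premiums),
            "min_premium": min(premiums),
            "location_freq": loc_counts[grp[0]["location"]] / total,
            "effected_2020": int(any(r["effective_year"] == 2020 for r in grp)),
            "in_client_data": int(pid in client_ids),
            "has_2019_premium": int(any(r["premium_year"] == 2019 for r in grp)),
        }
    return feats


features = build_features(rows, client_policy_ids)
print(features["P1"])
```

The same grouping pattern extends to the other keys mentioned above (location, agent, type) by changing what the rows are grouped on.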
Single XGBoost, 5-fold: CV 0.2420, Public 0.2419, Private 0.243
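For readers new to the idea, this is roughly how a 5-fold CV score is produced: each fold is held out once, a model is fit on the other four, and the fold scores are averaged. The Python sketch below uses a stand-in model that predicts the training mean, and mean squared error as a placeholder metric. Both are assumptions chosen to keep the example dependency-free; they are not the XGBoost model or the competition metric.

```python
import random


def kfold_indices(n_rows, n_folds=5, seed=42):
    """Shuffle row indices and deal them into n_folds disjoint folds."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def cross_validate(X, y, fit, predict, metric, n_folds=5, seed=42):
    """Hold out each fold once; return the mean score and the per-fold scores."""
    folds = kfold_indices(len(y), n_folds, seed)
    scores = []
    for k, valid_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in valid_idx])
        scores.append(metric([y[i] for i in valid_idx], preds))
    return sum(scores) / len(scores), scores


# Stand-in model (assumption): always predicts the mean of the training target.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, X_valid):
    return [model] * len(X_valid)


# Placeholder metric (assumption): mean squared error.
def mean_squared_error(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


X = [[float(i)] for i in range(20)]  # toy feature matrix
y = [i % 2 for i in range(20)]       # toy binary target
cv_score, fold_scores = cross_validate(X, y, fit_mean, predict_mean, mean_squared_error)
print(round(cv_score, 4), len(fold_scores))
```

Swapping in a real model only requires replacing `fit_mean` and `predict_mean` with functions that train and score that model.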
https://github.com/horlar1/Zimnat-Insurance-Challenge/blob/master/Solution.R
Wonderful to see R code. Thanks for sharing
Thanks aninda
Congratulations Holar, thanks for sharing !!!
Thanks Nikhil
Nice solution, Holar. I learned a few things. Thank you for sharing :)
Well done
Hello, and good morning. I would need your assistance on some concepts in this challenge. How do I get in touch with you?
linkedin https://www.linkedin.com/in/taiwo-ogundare/