Can you predict when an insurance policy will lapse in Zimbabwe?
My Solution
Kudos to everyone who participated in this challenge as it was a difficult one.

My approach is quite straightforward: create some aggregate features from the policy data, plus some flags. The major improvement came from training only on policies that appear in the client data and dropping policies that lapsed in 2017 and 2018.
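The training-set filter described above could look roughly like this in pandas. This is only a sketch: the table layout and the column names (`policy_id`, `lapse`, `lapse_year`) are made up for illustration and are not the competition's actual schema.

```python
import pandas as pd

# Hypothetical toy tables; real column names in the competition data may differ.
train = pd.DataFrame({
    "policy_id": ["P1", "P2", "P3", "P4"],
    "lapse": [1, 0, 1, 0],
    "lapse_year": [2017, None, 2019, None],  # None = policy did not lapse
})
client = pd.DataFrame({"policy_id": ["P1", "P3", "P4"]})

# Keep only policies that also appear in the client data ...
in_client = train["policy_id"].isin(client["policy_id"])
# ... and drop policies whose lapse happened in 2017 or 2018.
lapsed_early = train["lapse_year"].isin([2017, 2018])

train_filtered = train[in_client & ~lapsed_early].reset_index(drop=True)
print(train_filtered["policy_id"].tolist())
```

In this toy example P1 is dropped for lapsing in 2017 and P2 is dropped because it is missing from the client table.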


Unique counts: policies, products, principals, family per policy, count of unique mean_premium, frequency encoding of location
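As an illustration of the unique counts and the frequency encoding, here is a small pandas sketch. The columns (`policy_id`, `product`, `principal`, `location`) are hypothetical stand-ins, and only two of the counts are shown; the others would follow the same pattern.

```python
import pandas as pd

# Hypothetical policy table with several rows per policy.
policy = pd.DataFrame({
    "policy_id": ["P1", "P1", "P1", "P2", "P3"],
    "product":   ["A", "B", "A", "A", "C"],
    "principal": ["x", "x", "y", "z", "z"],
    "location":  ["HRE", "HRE", "HRE", "BYO", "HRE"],
})

# Number of distinct products and principals per policy.
counts = (
    policy.groupby("policy_id")
    .agg(n_products=("product", "nunique"),
         n_principals=("principal", "nunique"))
    .reset_index()
)

# Frequency encoding: replace each location by its share of rows.
loc_freq = policy["location"].value_counts(normalize=True)
policy["location_freq"] = policy["location"].map(loc_freq)

print(counts)
print(policy[["location", "location_freq"]])
```

Whether the frequency is computed on train only or on train and test together is a design choice the post does not specify.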

Aggregates: mean NPR PREMIUM and minimum NPR PREMIUM per policy, location, agent, type
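One possible way to build these group-wise premium statistics is a small helper that is applied once per grouping key. The helper name `add_group_stats` and the columns `policy_id`, `agent`, and `npr_premium` are assumptions for this sketch; location and type would be handled the same way.

```python
import pandas as pd

# Hypothetical policy table; npr_premium stands in for the NPR PREMIUM column.
policy = pd.DataFrame({
    "policy_id":   ["P1", "P1", "P2", "P3"],
    "agent":       ["a1", "a1", "a2", "a1"],
    "npr_premium": [100.0, 300.0, 50.0, 200.0],
})

def add_group_stats(df, key, value="npr_premium"):
    """Attach the mean and minimum of `value` within each `key` group."""
    stats = df.groupby(key)[value].agg(["mean", "min"])
    stats.columns = [f"{value}_{s}_by_{key}" for s in stats.columns]
    return df.join(stats, on=key)

features = policy
for key in ["policy_id", "agent"]:
    features = add_group_stats(features, key)

print(features)
```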

Flags: flag if policy was effected in 2020 else 0, flag if policy id is present in client data else 0, flag if policy has premium data from 2019 else 0
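The three flags could be derived along these lines. The tables and columns (`effective_date`, `payment_date`) are hypothetical, and "premium data from 2019" is read here as "at least one premium record dated in 2019", which is an assumption.

```python
import pandas as pd

# Hypothetical tables; the real schema may differ.
policy = pd.DataFrame({
    "policy_id": ["P1", "P2", "P3"],
    "effective_date": pd.to_datetime(["2020-03-01", "2018-07-15", "2019-11-30"]),
})
client = pd.DataFrame({"policy_id": ["P1", "P3"]})
payments = pd.DataFrame({
    "policy_id": ["P2", "P2", "P3"],
    "payment_date": pd.to_datetime(["2018-08-01", "2019-01-05", "2020-02-01"]),
})

# 1 if the policy was effected in 2020, else 0.
policy["effected_2020"] = (policy["effective_date"].dt.year == 2020).astype(int)

# 1 if the policy id is present in the client data, else 0.
policy["in_client_data"] = policy["policy_id"].isin(client["policy_id"]).astype(int)

# 1 if the policy has at least one premium record dated in 2019, else 0.
ids_2019 = payments.loc[payments["payment_date"].dt.year == 2019, "policy_id"]
policy["has_2019_premium"] = policy["policy_id"].isin(ids_2019).astype(int)

print(policy)
```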

And finally, UMAP dimension-reduction features using t-SNE (2 dimensions)
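A 2-dimensional embedding can be added as two extra feature columns. The sketch below uses scikit-learn's `TSNE` on random stand-in data; the feature matrix, the scaling step, and the perplexity value are all assumptions, and the `umap-learn` package would be an alternative if UMAP was what was actually used.

```python
import numpy as np
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

# Random stand-in for the numeric feature matrix (train and test stacked).
rng = np.random.default_rng(0)
X_all = rng.normal(size=(60, 8))

# Scale first so that no single feature dominates the distances.
X_scaled = StandardScaler().fit_transform(X_all)

# Perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=10, init="pca", random_state=0)
emb = tsne.fit_transform(X_scaled)

# The two embedding coordinates become two new features.
tsne_1, tsne_2 = emb[:, 0], emb[:, 1]
print(emb.shape)
```

scikit-learn's `TSNE` has no `transform` method for unseen rows, so train and test would need to be embedded together in one `fit_transform` call.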

Single XGBoost, 5-fold: CV 0.2420, Public 0.2419, Private 0.243

