Hi,
I'm sharing here the magic trick I found to achieve a 0.8+ public score and a 0.78 private score.
To gain a considerable boost, you just need to set all loans to 1 if they are: 1. Ghanaian (or better, loan_type == 3); 2. the last loan of the customer_id.
This method consistently gave me a boost between +0.07 and +0.10.
You can try it on your side and report here your scores 😉
Can I ask why you chose to set it to 1?
Also, do you mind having a private chat about your approach? I struggled to get beyond 0.66 on the LB despite my CV scores being 0.91.
I set it to 1 because last loans are often the ones with default.
To achieve better "normal" scores, I recommend great feature selection and a good CV strategy (groupkfold, grouping by customer_id).
With this, I achieved CV 0.91 and LB 0.70 (0.818 with the magic trick).
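For anyone unsure what "groupkfold, grouping by customer_id" looks like in practice, here is a minimal sketch using scikit-learn's `GroupKFold`. Only `customer_id` comes from this thread; the helper name `grouped_folds` and the number of folds are my own assumptions, so adapt them to your data.

```python
import pandas as pd
from sklearn.model_selection import GroupKFold


def grouped_folds(df, group_col="customer_id", n_splits=5):
    """Yield (train_idx, valid_idx) positional index arrays.

    All loans of a given customer land in the same fold, so the
    validation score is not inflated by a customer being seen in training.
    `grouped_folds` is a hypothetical helper, not part of any library.
    """
    groups = df[group_col].to_numpy()
    splitter = GroupKFold(n_splits=n_splits)
    for train_idx, valid_idx in splitter.split(df, groups=groups):
        yield train_idx, valid_idx
```

Inside the loop you would fit your model on `df.iloc[train_idx]` and score it on `df.iloc[valid_idx]`. If your CV is still far above the LB with this split, the gap is probably coming from something other than customer overlap between folds.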
Nice magic trick ...
In all the training I did, I always reached a CV of more than 0.93, but I can't get past a score of 0.64.
That overfitting is something I don't understand.
Any help please?
It's hard to say without more information. Make sure you have a great CV strategy (groupkfold by customer_id) and no leakage in your features.
What method did you use for feature selection?
What was your score before applying this post-processing? I mean the original model score.
Hi, congratulations! Just curious, what features did you engineer?
Honestly I don't understand the trick.
Any help please
Wow that's a very thoughtful trick!
This is very insightful. It actually works. 71.9 private and 75.1 public score when I tried it. I would like to know how you came up with this idea. Did you get insights from the data that drove you to take such an initiative?
Based on the conditions stated above, did you update the target column or did you create a new column?
I had the intuition that the Ghanaian distribution was weird when I saw the high scores on LB. So I tried a few things (I started with just the "last loan" trick, which is already very good on its own), and it worked!
I updated the column after making the predictions, and it indeed worked like magic 😅.
I really appreciate you coming up with your solution. Big ups and congratulations once again.
@CodeJoe can you please explain what and how you implemented it?
I'm finding it difficult to understand the idea here. Was a new feature created based on the conditions stated, or was it done after predictions?
Okay, first you predict on the test set. After that, you then update the predictions column with the conditions stated.
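To make the two steps concrete, here is a rough pandas sketch of the post-processing, not the original author's code. `customer_id` and `loan_type` are the names used in this thread; the `pred` and `disbursement_date` column names, the integer value `3` for the loan type, and the `require_both` switch are assumptions on my side. The thread does not settle whether both conditions must hold at once, so the switch lets you try either reading.

```python
import pandas as pd


def apply_override(test, pred_col="pred", date_col="disbursement_date",
                   ghana_loan_type=3, require_both=True):
    """Return a copy of `test` with predictions forced to 1 by the rule.

    Assumed (hypothetical) columns: `pred_col` holds the model's 0/1
    predictions and `date_col` holds the date each loan was taken.
    """
    out = test.copy()
    # Last loan per customer: the row(s) with that customer's latest date.
    latest = out.groupby("customer_id")[date_col].transform("max")
    is_last = out[date_col] == latest
    # Proxy for "Ghanaian" suggested in the thread.
    is_ghana = out["loan_type"] == ghana_loan_type
    mask = (is_last & is_ghana) if require_both else (is_last | is_ghana)
    out.loc[mask, pred_col] = 1
    return out
```

You would call it once on the test set after `model.predict`, then write the updated `pred` column to your submission file. Nothing is added as a feature and the training data is untouched.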
Bro! I really don't understand what you saw in the data or what you were thinking to come up with this trick of yours. Kudos though, the trick worked!
Thanks for sharing. How did you come up with setting Ghanaian loans, or loans with loan_type==3, to target 1?
Especially for setting loans with loan_type==3 to target 1, it's quite weird but magic!
Indeed!
I had the intuition that the Ghanaian distribution was weird when I saw the high scores on LB. So I tried a few things (I started with just the "last loan" trick, which is already very good on its own), and it worked!
Nice intuition!
Thanks for sharing. Should both conditions hold at the same time to set it to 1? What about Kenya?
It doesn't work with Kenya!
When you say last loan, do you mean the last loan a customer took or the loan that has the latest due date?
1st option !
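Since the two readings of "last loan" can point at different rows, here is a small sketch that flags both so you can check how often they disagree in your data. The column names `disbursement_date` and `due_date` and the helper `last_loan_flags` are assumptions for illustration; only `customer_id` is taken from the thread.

```python
import pandas as pd


def last_loan_flags(loans, taken_col="disbursement_date", due_col="due_date"):
    """Flag, per row, whether it is the customer's last loan under each reading.

    `last_taken`: latest date the loan was taken (the reading confirmed above).
    `latest_due`: latest due date (the alternative reading).
    """
    by_customer = loans.groupby("customer_id")
    flags = pd.DataFrame(index=loans.index)
    flags["last_taken"] = loans[taken_col] == by_customer[taken_col].transform("max")
    flags["latest_due"] = loans[due_col] == by_customer[due_col].transform("max")
    return flags
```

A short loan taken after a long one is the typical case where the two flags differ, so `(flags["last_taken"] != flags["latest_due"]).sum()` gives a quick count of affected rows.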
ok, thanks
@VincentSchuler Thank you so much for taking the time to share your insights !
Simply magic! Thanks @VincentSchuler 😁
Man! I'm feeling really dumb right now because I don't understand.
When you say "the Ghanaian distribution was weird when I saw the high scores on LB":
Distribution of what? There is no Ghana in the training set to check the number of 0s and 1s of the target. The test set, on the other hand, does have Ghana, but there is no target.
I wouldn't have thought of manually altering the model's predictions based on a pattern or rule that the model itself hasn't even "seen", not in a million years. I didn't understand what you saw or what you were thinking. But congratulations! Good job!