Here's my notebook for a mid-level private score of 0.66: https://www.kaggle.com/code/joshuaamayo/zindi-credit-scoring. Posting for any beginners to follow along with my comments and reflections in the markdown, and for seasoned competitors to give their tips/questions in the comments, as this was my first ever prized competition. Thanks!
It seems that the link is not working. Could you check, please?
Maybe it's the final character (".")?
Thanks for letting me know! The link is working now.
perfect!
Thanks!
https://www.kaggle.com/code/joshuaamayo/zindi-credit-scoring#Submission
this is the actual link
Interesting. We had a similar approach.
I split my notebooks into 5 notebooks: https://github.com/MakalaMabotja/credit-default-prediction/tree/main/experiments
It seems that if you didn't use the Ghana target = 1 trick that's been circulating, then you probably built an overfit model with an LB F1 score of around 0.7.
In the GitHub repo, the following notebooks don't seem to be working:
5-advance_model_building.ipynb
6-final_model.ipynb
could you check, please?
Thanks!
No, they're just empty.
I was experimenting with ensemble methods (Voting and Stacking) on top of the models I already had from XGB & RandomForest (boosting and bagging, respectively).
All of this is available in the 3rd notebook. I just haven't cleaned it up yet, but I thought I would share my thought process in case it's useful to someone else besides me.
I still needed to do some more feature engineering (notebook 4), but I ran out of time before getting back to it. After that, I wanted to see if I could combine a basic logistic regression model (best at accurate minority-class predictions - recall) and tree models (best at majority-class predictions - precision) to get a more generalized model.
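For anyone following along, here's a minimal sketch of that idea: Voting and Stacking on top of bagging + boosting base models, with a logistic regression as the stacking meta-learner. It uses synthetic data and sklearn's `GradientBoostingClassifier` as a stand-in for XGBoost so it runs self-contained; all dataset and model settings here are illustrative assumptions, not the actual competition pipeline.

```python
# Sketch only: VotingClassifier and StackingClassifier over bagging (RF) and
# boosting (GB) base models. GradientBoostingClassifier stands in for XGBoost;
# swap in xgboost.XGBClassifier if you have it installed.
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier, RandomForestClassifier,
                              StackingClassifier, VotingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Imbalanced toy data (roughly 80/20), standing in for a credit-default target
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.8, 0.2], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

base = [("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
        ("gb", GradientBoostingClassifier(random_state=42))]

# Soft voting averages the base models' predicted probabilities
voter = VotingClassifier(estimators=base, voting="soft")

# Stacking fits a logistic regression on the base models' out-of-fold predictions
stacker = StackingClassifier(estimators=base,
                             final_estimator=LogisticRegression(max_iter=1000))

for name, model in [("voting", voter), ("stacking", stacker)]:
    model.fit(X_tr, y_tr)
    print(name, round(f1_score(y_te, model.predict(X_te)), 3))
```

The stacking variant is one way to get the "LR + trees" combination: the trees supply the features (their fold-wise probability predictions) and the logistic regression learns how to weight them.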
Yes, I didn't use a stratified split, so it's possible that some customer IDs leaked into my validation set.
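A group-aware split is the usual fix for that kind of leakage: all rows for a given customer stay on one side of the split. A minimal sketch with synthetic data (the `customer_id` column name and sizes are assumptions for illustration):

```python
# Sketch only: group-aware train/validation split so rows from the same
# customer never land in both sets. Data and column names are made up.
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
n = 200
customer_id = rng.integers(0, 50, size=n)   # 50 customers, multiple rows each
X = rng.normal(size=(n, 5))
y = rng.integers(0, 2, size=n)

# test_size is a fraction of *groups* (customers), not rows
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=0)
train_idx, val_idx = next(splitter.split(X, y, groups=customer_id))

# No customer appears on both sides of the split
assert set(customer_id[train_idx]).isdisjoint(customer_id[val_idx])
```

For cross-validation, `GroupKFold` (or `StratifiedGroupKFold` if you also need class balance per fold) does the same thing fold-wise.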
thank you!!!