DSN AI Bootcamp Qualification Hackathon by Data Science Nigeria

Knowledge

Predict customers who will default on a loan

1095 data scientists enrolled, 750 on the leaderboard

Nigeria

9 September – 3 October

25 days

First place approach...

This is the link to my solution:

https://github.com/aifenaike/DSN_KOWOPE

It's just a simple approach: no feature engineering, basically just stacking, as this was a learning experience for me... I hope it helps.

Wow. Nice concept bro. Thanks for sharing.

Thanks for sharing your notebook.

Congratulations! and Thanks for sharing your solution.

Great... such a simple approach. Thumbs up, man. Congratulations!

Thanks

This is a good one, bro. I really admire its simplicity, given that I was expecting something complex 👏🏿👏🏿👏🏿👏🏿

Thanks a lot for sharing 👍🏽, Congrats on your win. 🏆

Thank you so much for this.

Why use linear regression as a meta-estimator, since this is a classification task? Please kindly answer this one question; I grasp the rest.

In the case of the meta-model prediction, he was not predicting a class but float values, based on the class probabilities generated by the sub-models.
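To make this concrete, here is a minimal sketch of the idea: the sub-models output P(default) via predict_proba, and a plain linear regression is fitted on those probabilities, producing float scores rather than hard class labels. The dataset and base models below are illustrative stand-ins, not the author's actual pipeline.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary "loan default" data (illustrative only).
X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Sub-models: each outputs the probability of the positive class.
rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
lr = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
probs_tr = np.column_stack([rf.predict_proba(X_tr)[:, 1],
                            lr.predict_proba(X_tr)[:, 1]])
probs_te = np.column_stack([rf.predict_proba(X_te)[:, 1],
                            lr.predict_proba(X_te)[:, 1]])

# Meta-model: linear regression on the sub-model probabilities.
# Its output is a float score, not a class label.
meta = LinearRegression().fit(probs_tr, y_tr)
scores = meta.predict(probs_te)
```

For a submission scored on AUC, these float scores can be submitted directly; for hard labels, they would need a threshold.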

I used a classifier as my final estimator... wrong button pressed... oops....


It depends; there is work on Kaggle that uses classifiers to stack too, but at times that requires additional tuning, and we had only 10 submissions per day, so...

Hi, thanks for sharing your solution, and congratulations! I have a question: in the case of label prediction (multi-class classification), how do we stack? Can stacking be done using probabilities (with linear regression as the meta-learner) to predict the actual labels, or should logistic regression be used to predict the labels? Please shed more light on how to go about it. Thanks.

Alright @Adebaicy

In my case, what I wanted was to see the correlation between the predictions from different classifiers, and also to obtain a function that takes the predicted values from the sub-models and computes a float value (not another probability). What better candidate than regressors! Beyond that, the reason I chose linear regression comes down to two things: simplicity and interpretable results.

Simplicity: I tried using classifiers such as ExtraTrees and XGBoost as the meta-model, and they failed. When you use classifiers it also gets complex, because you still have to call predict_proba and then select the probability of defaulting: predicting a probability from probabilities, lol.

Interpretability: unlike tree-based classifiers, which expose feature importances (the features responsible for splitting at nodes), I wanted to see which models were contributing to the magnitude of the values the stack predicted. What better way than the coef_ of a linear regression?
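The interpretability point can be sketched as follows: fit the linear-regression meta-learner on out-of-fold probabilities (via cross_val_predict, which keeps the meta-features honest) and read each base model's contribution straight off coef_. The base models and data here are illustrative assumptions, not the author's exact setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, GradientBoostingClassifier
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_predict

# Synthetic binary data (illustrative only).
X, y = make_classification(n_samples=400, random_state=1)

bases = {
    "extratrees": ExtraTreesClassifier(random_state=1),
    "gboost": GradientBoostingClassifier(random_state=1),
}

# Column i = out-of-fold P(positive class) from base model i.
oof = np.column_stack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
    for m in bases.values()
])

# The meta-learner's coefficients show how much each sub-model's
# probability contributes to the stacked score.
meta = LinearRegression().fit(oof, y)
for name, w in zip(bases, meta.coef_):
    print(f"{name}: weight {w:.3f}")
```

A large coefficient means that sub-model dominates the stacked prediction; a near-zero one means it adds little on top of the others.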

Nice concept bro!!! I was expecting something complex.

Love the simplicity.

No feature engineering... wow... thanks for sharing, bro.

Simplicity is the key. Thanks for sharing your notebook.



Simple and sweet

Nice solution

Congratulations, bro, and thanks for sharing.

Congratulations 👏. Thanks for sharing your notebook.

I'd like to have your contact, bro; this is mine: 08124940975. Please call me or text me on WhatsApp. Thanks, and congratulations, bro.