Financial Inclusion in Africa
Can you predict who in Africa is most likely to have a bank account?
Metric Being Used
Notebooks · 1 Sep 2019, 17:57 · 6

Hi there. Since the data is imbalanced, the metric used should not be accuracy or error rate, but macro-averaged F1 or ROC-AUC instead. That way we would be competing to build better models.
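To make the concern concrete, here is a minimal sketch in plain Python of a model that always predicts the majority class. The 86/14 class split is an assumption made up for illustration, not the competition's actual distribution, and the helper functions are hand-written stand-ins rather than any library's API.

```python
# Hypothetical imbalanced labels: 86 negatives, 14 positives (illustrative only).
y_true = [0] * 86 + [1] * 14
# A "model" that always predicts the majority class, with a constant score.
y_pred = [0] * len(y_true)
scores = [0.0] * len(y_true)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, cls):
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    # Convention used here: F1 is 0 when the class is never predicted or present.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    classes = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def roc_auc(y_true, scores):
    # Probability that a random positive is scored above a random negative,
    # counting ties as one half.
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


baseline_accuracy = accuracy(y_true, y_pred)
baseline_macro_f1 = macro_f1(y_true, y_pred)
baseline_auc = roc_auc(y_true, scores)

print(f"accuracy: {baseline_accuracy:.3f}")
print(f"macro F1: {baseline_macro_f1:.3f}")
print(f"ROC-AUC:  {baseline_auc:.3f}")
```

Under these assumed numbers the do-nothing model looks strong on accuracy while macro F1 and ROC-AUC expose that it never finds a single positive.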

Discussion 6 answers

Zindi, please look into this. ROC-AUC would be a better evaluation metric for judging how good our models are.

True, ROC-AUC would be a better choice if this were a model that had to be put to real-life use, but isn't it a bit late for a metric change?

We can balance the data, no?

Balancing the data by oversampling/undersampling, or just weighting the classes properly, will give a worse score under the error-rate metric. If we were being tested on F1 score, balancing would be a good idea.
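A rough sketch of that trade-off, using two invented confusion matrices on the same 1000 rows (140 positives). The counts are hypothetical and only illustrate the mechanism: a reweighted model typically trades extra false positives for fewer false negatives. They are not measured on the competition data.

```python
# Hypothetical confusion counts for two models on the same 1000 rows.
# "unweighted": trained on the data as-is; "reweighted": trained with
# class weights or resampling. All numbers are made up for illustration.
unweighted = {"tn": 830, "fp": 30, "fn": 90, "tp": 50}
reweighted = {"tn": 730, "fp": 130, "fn": 40, "tp": 100}


def error_rate(c):
    total = c["tn"] + c["fp"] + c["fn"] + c["tp"]
    return (c["fp"] + c["fn"]) / total


def f1_positive(c):
    # F1 for the positive class, written directly in terms of counts.
    return 2 * c["tp"] / (2 * c["tp"] + c["fp"] + c["fn"])


for name, counts in [("unweighted", unweighted), ("reweighted", reweighted)]:
    print(f"{name}: error rate {error_rate(counts):.3f}, "
          f"F1 (class 1) {f1_positive(counts):.3f}")
```

With these assumed counts the reweighted model has the higher error rate but the higher F1 on the minority class, which is the pattern described above.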

So what should we do then?

Classify the data points as they are now. You will miss a lot of '1's relative to their total number in the dataset, and a smaller percentage of '0's. Whether to balance or weight the data is a business problem, not a data science one. If the cost of misclassifying ones as zeros is bearable for a business, then a model like the ones on the leaderboard right now is viable for production. I hope that's clear enough. If this were a cancer detection challenge, then modelling this way would be wrong.
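One way to read the "business problem" point is as a cost calculation. The sketch below assumes two hypothetical models described only by their false-positive and false-negative counts, plus made-up per-error costs. None of these numbers come from the competition.

```python
# Hypothetical (false positives, false negatives) for two candidate models.
models = {
    "unweighted": (30, 90),   # few false alarms, many missed '1's
    "reweighted": (130, 40),  # more false alarms, fewer missed '1's
}


def total_cost(fp, fn, cost_fp, cost_fn):
    # Total cost of the errors under assumed per-error business costs.
    return fp * cost_fp + fn * cost_fn


def cheaper_model(cost_fp, cost_fn):
    return min(
        models,
        key=lambda name: total_cost(*models[name], cost_fp, cost_fn),
    )


# Equal costs: missing a '1' is no worse than a false alarm.
print(cheaper_model(cost_fp=1, cost_fn=1))
# Missing a '1' is five times as costly (closer to the cancer-detection case).
print(cheaper_model(cost_fp=1, cost_fn=5))
```

Under these assumptions the preferred model flips once a missed '1' costs more than about twice a false alarm, so the choice depends on the costs rather than on the model alone.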