
Mobile Money and Financial Inclusion in Tanzania Challenge

Helping Tanzania, United Republic of
$2 250 USD
Challenge completed over 6 years ago
Prediction
703 joined
162 active
Start: Mar 26, 19
Close: Jun 30, 19
Reveal: Jul 01, 19
sample_submission clarification
Data · 20 Apr 2019, 21:26 · 3

Hello guys,

I need a clarification on the sample submission:

no_financial_services | other_only | mm_only | mm_plus

0.5423 | 0.9987 | 0.12 | 0.0123

If the numbers represent probabilities, shouldn't they add up to 1? Or am I missing something?

Discussion 3 answers

Had the same thought! I suspect it's just a bad example

21 Apr 2019, 07:58
Upvotes 0

Hi, I am new to this challenge. Can you please explain the submission process?

23 Apr 2019, 03:05
Upvotes 0

"Your goal is to accurately classify each individual into four mutually exclusive categories..."

So an ideal submission would be 0|1|0|0. But predictions can have uncertainty. And for multi-class classification, the output of a model is often a predicted probability for each class. Depending on the model, these may be calculated independently for each class and thus won't be guaranteed to sum to one.
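If you do want each row to sum to one, a rough sketch of rescaling per-class scores is below. The first row is the one quoted in the question; the second row and the helper name normalize_row are made up for illustration, and nothing in this thread says the competition requires normalised rows.

```python
# Column order taken from the sample_submission header quoted in the question.
CLASSES = ["no_financial_services", "other_only", "mm_only", "mm_plus"]

def normalize_row(scores, eps=1e-15):
    """Rescale non-negative per-class scores so they sum to 1.

    Hypothetical helper for illustration: eps only guards against a row
    of all zeros, it is not part of any competition rule.
    """
    clipped = [max(s, eps) for s in scores]
    total = sum(clipped)
    return [s / total for s in clipped]

rows = [
    [0.5423, 0.9987, 0.12, 0.0123],  # the row from the question (sums to ~1.67)
    [0.10, 0.20, 0.30, 0.40],        # made-up row that already sums to 1
]

normalized = [normalize_row(r) for r in rows]

for row in normalized:
    print(" | ".join(f"{p:.4f}" for p in row), "-> sum", round(sum(row), 6))
```

Rescaling keeps the ranking of the classes within a row unchanged; it only changes how much weight each one gets.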

As an example, scikit-learn has a predict_proba() method for many classifiers. The probabilities it returns aren't always well calibrated, especially with some tree-based models. The scikit-learn guide on calibration explains how to improve them: https://scikit-learn.org/stable/modules/calibration.html
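A minimal sketch of what that looks like, assuming a random forest and synthetic four-class data from make_classification (not the competition data). The model choice, sample sizes and calibration settings are arbitrary picks for illustration.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 4 classes, 10 features.
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=6,
    n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Uncalibrated model: predict_proba returns one column per class.
raw = RandomForestClassifier(n_estimators=50, random_state=0)
raw.fit(X_train, y_train)
raw_probs = raw.predict_proba(X_test)

# Same model wrapped in a calibrator, fitted with 3-fold cross-validation.
calibrated = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=50, random_state=0),
    method="sigmoid", cv=3,
)
calibrated.fit(X_train, y_train)
cal_probs = calibrated.predict_proba(X_test)

print(raw_probs[0])
print(cal_probs[0])
```

Whether calibration actually helps on the real data is something to check on a validation split rather than assume.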

As for why you may want to submit these predicted probabilities rather than picking the most likely class and submitting a definite prediction (0|0|0|1), consider the goal of optimising the score, i.e. minimising the loss. Where the model is confident, we want the prediction to be as close as possible to 0 or 1. Where it is uncertain, predicting a value closer to 0.5 is a way of hedging your bets: a penalty is incurred either way, but it is much smaller than the penalty for a confident prediction that turns out to be wrong.
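To put rough numbers on the hedging argument, here is a small sketch that assumes a log-loss style metric, where the penalty for a row is the negative log of the probability given to the true class. The thread doesn't confirm the exact metric, and the function name and the clipping value eps are illustrative assumptions.

```python
import math

def penalty(p_true_class, eps=1e-15):
    """Negative log of the probability assigned to the true class.

    Assumption: a log-loss style metric. eps clips the input so that
    exact 0s and 1s don't produce an infinite penalty.
    """
    p = min(max(p_true_class, eps), 1 - eps)
    return -math.log(p)

confident_right = penalty(0.99)  # model was sure and correct
hedged = penalty(0.50)           # model sat on the fence
confident_wrong = penalty(0.01)  # model was sure and wrong

print(f"confident and right: {confident_right:.3f}")
print(f"hedged at 0.5:       {hedged:.3f}")
print(f"confident and wrong: {confident_wrong:.3f}")
```

Under this assumption a confident miss costs several times more than a hedge, while a confident hit saves comparatively little.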

It would be interesting to compare the two approaches. Does someone feel like submitting their predictions as probabilities, and then making a second submission with the highest probability mapped to 1 and the rest to 0?
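Before spending submissions on it, the comparison can be tried offline on a validation set. The sketch below uses three made-up rows and a log-loss style score, which is an assumption about the metric; harden and mean_log_loss are hypothetical helper names.

```python
import math

def harden(row):
    """Map the highest probability in a row to 1 and the rest to 0."""
    top = max(range(len(row)), key=row.__getitem__)
    return [1.0 if i == top else 0.0 for i in range(len(row))]

def mean_log_loss(rows, true_idx, eps=1e-15):
    """Average negative log of the probability given to the true class.

    Assumption: values are clipped to [eps, 1 - eps], as many log-loss
    implementations do, so hard 0s give a large but finite penalty.
    """
    total = 0.0
    for row, t in zip(rows, true_idx):
        p = min(max(row[t], eps), 1 - eps)
        total -= math.log(p)
    return total / len(rows)

# Made-up predictions; in the third row the top class is NOT the true one.
soft = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.6, 0.2, 0.1],
    [0.4, 0.3, 0.2, 0.1],
]
truth = [0, 1, 1]

hard = [harden(r) for r in soft]

soft_score = mean_log_loss(soft, truth)
hard_score = mean_log_loss(hard, truth)
print(f"probabilities: {soft_score:.3f}")
print(f"hard 0/1:      {hard_score:.3f}")
```

On these toy rows the single wrong hard prediction dominates the average. How the two approaches compare on the real leaderboard depends on the actual metric and the model's accuracy.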

23 Apr 2019, 08:07
Upvotes 0