***Correction: I confused the metric with another competition's.***
But there's still an issue: predictions made with no technique at all are getting better scores than models built with technique. What could be the cause?
----------------------------------------------------------------------------------------------
I suspect the reason we get "good" scores when submitting all zeros is that Zindi is using the wrong evaluation metric. The threshold metric used assumes the data is balanced, but the data in this case is imbalanced.
"Although widely used, classification accuracy is almost universally inappropriate for imbalanced classification. The reason is, a high accuracy (or low error) is achievable by a no skill model that only predicts the majority class." -
https://machinelearningmastery.com/tour-of-evaluation-metrics-for-imbalanced-classification/
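To make the quoted point concrete, here is a minimal sketch in plain Python. The 90/10 class split is an assumption chosen for illustration, not the actual distribution in this competition, and `accuracy` and `recall` are small helper functions written just for this example.

```python
# Hypothetical imbalanced labels: 90 negatives, 10 positives (assumed split).
y_true = [0] * 90 + [1] * 10

# "No skill" submission: predict the majority class for every row.
y_pred = [0] * len(y_true)

def accuracy(y_true, y_pred):
    """Fraction of rows where the predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that were predicted as positive."""
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    actual_pos = sum(t == positive for t in y_true)
    return true_pos / actual_pos if actual_pos else 0.0

baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall(y_true, y_pred)

print(baseline_accuracy)  # 0.9, looks good on paper
print(baseline_recall)    # 0.0, not a single positive case is found
```

Under these assumed numbers the all-zeros submission reaches 90% accuracy while missing every positive case, which is the behaviour the quote warns about.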
Thanks for the info.
Zindi is using log loss, not accuracy.
Just checked: I confused the metric with the one from the Financial Inclusion Competition.
However, the scoring Zindi is using here does seem problematic. Whether the submission is all 1s or all 0s, it gets the same score, and that score is better than any model I'm running. That can't be appropriate?
I guess what you should focus on is the probability that the classes your model predicts are actually correct, with minimal error (log loss).
In this regard, your score should typically lie between 0 and 1, i.e., you should be getting something like 0.44, 0.33167, or 0.7532. That's just how it should look.
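For anyone who wants to sanity-check what log loss does with constant submissions, here is a rough sketch in plain Python. The 90/10 labels are made up for illustration, and the clipping value `eps` is an assumption; how Zindi clips probabilities (if at all) isn't stated anywhere in this thread.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean binary log loss. Probabilities are clipped to [eps, 1 - eps]
    so that hard 0/1 predictions do not hit log(0). eps is an assumed value."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# Hypothetical imbalanced labels: 90 negatives, 10 positives (assumed split).
y_true = [0] * 90 + [1] * 10
n = len(y_true)

all_zeros = log_loss(y_true, [0.0] * n)  # hard 0 for every row
all_ones = log_loss(y_true, [1.0] * n)   # hard 1 for every row
base_rate = log_loss(y_true, [0.1] * n)  # constant probability = positive rate

print(all_zeros, all_ones, base_rate)
# base_rate is about 0.325; both hard-label submissions score well above 1,
# and all_ones scores much worse than all_zeros because it is wrong on 90% of rows.
```

Two things stand out under these assumptions: log loss has no upper bound, so scores above 1 are possible for confident wrong predictions, and on imbalanced labels an all-0s file and an all-1s file should not receive the same log loss.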
That's not my issue. I'm finding the scoring on Zindi problematic, and others are as well if you read the Discussion Board. A model with no technique scores better than a model with technique.
I wouldn't say an inappropriate evaluation metric was used, but I would say inappropriate test values might have been used on the Zindi portal, intentionally or unintentionally.
Perhaps
I'll tell you even more, guys. When I filled the submission not with 0 or 1, not even with numbers, but with a string (I used 'Zindi' :)), I got the same result! Something is wrong.
Totally agree with you; we should treat this as an imbalanced classification problem.