South African COVID-19 Vulnerability Map by #ZindiWeekendz

Helping Africa
$300 USD
Challenge completed over 5 years ago
Prediction
319 joined
177 active
Start: Apr 03, 20
Close: Apr 05, 20
Reveal: Apr 05, 20
marcusinthesky
RMSE is theoretically the wrong metric
Help · 5 Apr 2020, 19:36 · 6

RMSE assumes that the conditional distribution of the target is symmetric and can range from -infinity to +infinity. Our percent-vulnerable values are bounded between 0 and 1. The metric should have been something like the deviance of the Beta distribution, or else we should have been predicting log(percent vulnerable). As it stands, this is going to bias any insights we get from our models. Maybe this is something for future reference.
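To make the symmetry point concrete, here is a minimal sketch (not the competition's scoring code) comparing squared error with a Beta negative log-likelihood for a target near zero. The mean/precision parameterization and the precision value `phi=10.0` are assumptions chosen purely for illustration.

```python
import math

def squared_error(y, mu):
    """Squared error: symmetric in the sign of (y - mu)."""
    return (y - mu) ** 2

def beta_nll(y, mu, phi=10.0):
    """Negative log-likelihood of y under Beta(mu * phi, (1 - mu) * phi).

    Requires 0 < y < 1 and 0 < mu < 1. phi is a hypothetical precision value.
    """
    a = mu * phi
    b = (1.0 - mu) * phi
    # log of the Beta function B(a, b)
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return log_norm - (a - 1.0) * math.log(y) - (b - 1.0) * math.log(1.0 - y)

y = 0.05                 # observed share, close to the lower bound
low, high = 0.01, 0.09   # two predictions equally far from y

print(squared_error(y, low), squared_error(y, high))
print(beta_nll(y, low), beta_nll(y, high))
```

The two squared errors match, while the Beta likelihood treats the under-prediction and the over-prediction differently because it knows the target cannot go below 0.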

Discussion 6 answers

@marcusinthesky I agree with you. RMSE assigns a higher weight to larger errors, which makes it more useful when large errors are present and should be penalized heavily.

5 Apr 2020, 20:07
Upvotes 0

Did anyone train with MAE?

5 Apr 2020, 21:16
Upvotes 0

RMSE is the right metric; if you log-transform the target variable, you obtain a low RMSE.

5 Apr 2020, 22:31
Upvotes 0

Log-transforming values that are zero or very close to zero doesn't correlate well with the LB. I think it is much better to use the log when most values are not too close to zero.
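A small sketch of why zeros are a problem on the log scale, assuming a clipped log transform; the helper names and the clipping constant `EPS` are hypothetical and not part of the competition setup.

```python
import math

EPS = 1e-6  # hypothetical clipping constant; log(0) is undefined

def safe_log(p, eps=EPS):
    """Log transform with clipping so exact zeros do not raise an error."""
    return math.log(max(p, eps))

def inverse_safe_log(z):
    """Map a value on the log scale back to the original scale."""
    return math.exp(z)

targets = [0.0, 0.001, 0.2]
transformed = [safe_log(p) for p in targets]
recovered = [inverse_safe_log(z) for z in transformed]

# Gap between the first two targets on each scale
raw_gap = abs(targets[0] - targets[1])
log_gap = abs(transformed[0] - transformed[1])
print(raw_gap, log_gap)
```

On the raw scale the first two targets are almost identical, but after the clipped log they are several units apart, so a squared error on the log scale is dominated by near-zero rows that barely affect the raw-scale RMSE.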

marcusinthesky

I partially agree with @Engineer. RMSE(log target) would be a better assumption, though (1) it is not the current metric and (2) it still scores us against a distribution that is not the same as our data's. @DrFad, I did not use MAE because MAE is not the metric used in the competition. MAE is also not differentiable at zero error, so it is harder to optimize directly.

6 Apr 2020, 07:03
Upvotes 0
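On the differentiability of MAE: the absolute error has a kink only at zero residual, and a subgradient (the sign of the residual) is still defined there. The sketch below fits a single constant prediction by subgradient descent; the toy targets, learning rate, and step count are assumptions for illustration only.

```python
def mae(targets, c):
    """Mean absolute error of a constant prediction c."""
    return sum(abs(y - c) for y in targets) / len(targets)

def mae_subgradient(targets, c):
    """A subgradient of MAE with respect to c; sign(0) is taken as 0."""
    def sign(r):
        return (r > 0) - (r < 0)
    # d/dc |y - c| = -sign(y - c) wherever y != c
    return -sum(sign(y - c) for y in targets) / len(targets)

targets = [0.1, 0.2, 0.9]   # toy values in [0, 1]; the median is 0.2
c = 0.0
learning_rate = 0.01
for _ in range(200):
    c -= learning_rate * mae_subgradient(targets, c)

print(c, mae(targets, c))
```

With a fixed step size the estimate hovers near the median of the targets rather than settling exactly, which is the usual behavior of plain subgradient descent.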