
South African COVID-19 Vulnerability Map by #ZindiWeekendz

marcusinthesky

RMSE is theoretically the wrong metric

Help · 5 Apr 2020, 19:36 · 6

RMSE implicitly assumes that the conditional distribution of the target is symmetric and can range from -infinity to +infinity (minimizing it is equivalent to maximum likelihood under Gaussian errors). Our percent-vulnerable values are bounded between 0 and 1. The metric should have been something like the deviance of a Beta distribution, or we should have been predicting log(percent vulnerable). Scoring with RMSE is going to bias any insights we get from our models. Maybe this is one for future reference.
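To make the symmetry point concrete, here is a minimal sketch comparing squared error with the negative log-likelihood of a Beta distribution. The mean/precision parameterisation (`mu`, `phi`), the value `phi=20`, and the function names are assumptions chosen for illustration, not anything used in the competition. For a fixed precision, the Beta deviance differs from this negative log-likelihood only by a factor of 2 and an offset that does not depend on the prediction.

```python
import math

def squared_error(y, mu):
    """Squared error: symmetric in the sign of (y - mu)."""
    return (y - mu) ** 2

def beta_nll(y, mu, phi=20.0):
    """Negative log-likelihood of y under Beta(mu * phi, (1 - mu) * phi).

    Assumption: mean/precision parameterisation with a fixed, hypothetical phi.
    Both y and mu must lie strictly inside (0, 1).
    """
    a = mu * phi
    b = (1.0 - mu) * phi
    log_pdf = (
        math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
        + (a - 1.0) * math.log(y)
        + (b - 1.0) * math.log(1.0 - y)
    )
    return -log_pdf

# A target near the lower bound, missed by the same amount in each direction.
y = 0.10
under, over = 0.05, 0.15

se_under, se_over = squared_error(y, under), squared_error(y, over)
nll_under, nll_over = beta_nll(y, under), beta_nll(y, over)

print("squared error:", se_under, se_over)
print("beta NLL     :", nll_under, nll_over)
```

Squared error scores the two misses identically, while the Beta likelihood does not, because the distribution is skewed near the boundary. Note that this sketch is undefined for targets exactly equal to 0 or 1.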

Discussion 6 answers

Lawrence_Moruye

@marcusinthesky I agree with you. RMSE assigns a higher weight to larger errors, meaning it is more useful when large errors are present.
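A small sketch of the "higher weight to larger errors" point: two sets of predictions with the same MAE but different RMSE. The numbers and function names are made up for illustration.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

y_true = [10, 10, 10, 10]
pred_spread = [11, 11, 11, 11]    # four errors of size 1
pred_outlier = [10, 10, 10, 14]   # one error of size 4

print("MAE :", mae(y_true, pred_spread), mae(y_true, pred_outlier))
print("RMSE:", rmse(y_true, pred_spread), rmse(y_true, pred_outlier))
```

Both prediction sets have the same total absolute error, but RMSE doubles when that error is concentrated in a single row.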

5 Apr 2020, 20:07

Upvotes 0

Olayinka_Fadahunsi

Did anyone train with MAE?

5 Apr 2020, 21:16

Upvotes 0

Engineer

RMSE is the right metric; if you log-transform the target variable, a low RMSE is obtained.

5 Apr 2020, 22:31

Upvotes 0

Lawrence_Moruye

Taking the log of values that are zero or very close to zero doesn't correlate well with the LB. I think it is much better to use the log when most values are not too close to zero.
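A rough sketch of why a log target behaves badly near zero. The offset `eps` and the helper names are hypothetical choices, not something from the competition; a plain `math.log(0.0)` raises an error, so some offset is needed just to make zeros usable.

```python
import math

def log_target(y, eps=1e-3):
    """Log-transform with a small offset (eps is an arbitrary, assumed value)."""
    return math.log(y + eps)

def inverse_log_target(z, eps=1e-3):
    """Map a prediction made in log space back to the original scale."""
    return math.exp(z) - eps

# Without an offset, a zero target cannot be transformed at all.
try:
    math.log(0.0)
    log_of_zero_failed = False
except ValueError:
    log_of_zero_failed = True

# Raw gap of 0.001 near zero vs. raw gap of 0.2 further from zero.
small_gap = log_target(0.002) - log_target(0.001)
large_gap = log_target(0.40) - log_target(0.20)

print("log(0) fails:", log_of_zero_failed)
print("gap in log space near zero:", small_gap)
print("gap in log space away from zero:", large_gap)
print("round trip of 0.0:", inverse_log_target(log_target(0.0)))
```

On the raw scale the second gap is 200 times larger than the first, but in log space the two are of similar size. A model tuned on the log target therefore spends much of its effort on tiny values that barely move an RMSE computed on the raw scale.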

replied to Engineer · 6 Apr 2020, 04:25

Upvotes 0

Engineer

Noted! buddy...

replied to Lawrence_Moruye · 6 Apr 2020, 20:54

Upvotes 0

marcusinthesky

I partially agree with @Engineer: RMSE(log target) would be a better assumption, though (1) it is not the current metric and (2) it still scores us against a distribution that is not the same as our data. @DrFad I did not use MAE because MAE is not the metric used in the competition. MAE is also not differentiable at zero error, so it is harder to optimize directly with gradient-based methods.
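The differentiability remark can be checked numerically. This is only a sketch with a hypothetical helper, `one_sided_slopes`, that estimates the slope of a loss just to the left and just to the right of a point.

```python
def one_sided_slopes(loss, x, h=1e-6):
    """Finite-difference slopes of loss at x, approached from the left and the right."""
    left = (loss(x) - loss(x - h)) / h
    right = (loss(x + h) - loss(x)) / h
    return left, right

def abs_loss(error):
    """Per-sample loss behind MAE."""
    return abs(error)

def sq_loss(error):
    """Per-sample loss behind MSE/RMSE."""
    return error ** 2

abs_left, abs_right = one_sided_slopes(abs_loss, 0.0)
sq_left, sq_right = one_sided_slopes(sq_loss, 0.0)

print("absolute loss slopes at 0:", abs_left, abs_right)
print("squared loss slopes at 0 :", sq_left, sq_right)
```

The absolute loss has slopes of -1 and +1 on either side of zero, so there is no single gradient at that point, while the squared loss has a slope that shrinks smoothly to zero. Everywhere else the absolute loss has a well-defined gradient, so the kink only matters at exactly zero error.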

6 Apr 2020, 07:03

Upvotes 0
