
agriBORA Commodity Price Forecasting Challenge

Helping Kenya · €8,250 EUR · 14 days left
Tags: Data analysis · GIS · Time-series · Forecasting · Nowcasting
704 joined · 219 active
Start: Nov 14, 25 · Close: Dec 27, 25 · Reveal: Jan 13, 26
CodeJoe
I think I don't understand the LB now
Platform · 9 Dec 2025, 17:50 · 15

This is strange.

Discussion · 15 answers
Koleshjr
Multimedia University of Kenya
9 Dec 2025, 17:55
Upvotes 0
CodeJoe

Watch it carefully.

For example, @keystats has an RMSE of 1.4545 and an MAE of 1.1750,

and I have an RMSE of 3.4738 and an MAE of 2.573709476.

So it means @keystats must be ahead of me on the leaderboard, even with the formula from @J0NNY in that chat.

Am I missing something?

Koleshjr
Multimedia University of Kenya

It is indeed messed up, but maybe that is not the best example to use, as that is the right behavior. Higher is better, so you should be on top of him. But @micadee, for example, has lower scores and is in second. That shouldn't be the case. So it is indeed confusing!

CodeJoe

I’m getting higher RMSE and MAE scores than him, which should normally correspond to a lower performance score, right? Conversely, if I had lower RMSE and MAE, I would expect a higher score.

At the moment, the leaderboard seems quite off, which is really confusing. I think I’ll wait for the last two weeks before drawing any conclusions, when the results will be more meaningful.

Koleshjr
Multimedia University of Kenya

I mean in the traditional sense, yes, but the score you are seeing on the LB is not the actual MAE/RMSE. Those have already been normalized.

💻 Introducing Multi-Metric Evaluation, or One Metric to Rule them All

Read the above article. It says:

All metric scores are normalised before being shown on the leaderboard. This ensures fairness when a challenge includes both metrics you want to maximise (such as Accuracy) and metrics you want to minimise (such as Log Loss).

So the score has already been normalized, I guess.

But that still doesn't mean the LB is not messed up; it is, but in a mixed-up way. The calculation is somehow wrong, I don't know.
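
The article does not publish the exact formula, but a common way to make mixed metrics comparable is min-max scaling, inverted for lower-is-better metrics such as RMSE and MAE so that a higher normalised score always ranks higher. A minimal sketch of that idea (the function and the scaling details are assumptions, not the platform's actual code):

```python
import numpy as np

def normalise(raw_scores, maximise=True):
    """Min-max scale raw metric scores to [0, 1].

    For lower-is-better metrics (RMSE, MAE, log loss) the result is
    inverted, so a higher normalised score always means a better rank.
    NOTE: assumed formula, not the platform's published one.
    """
    raw = np.asarray(raw_scores, dtype=float)
    lo, hi = raw.min(), raw.max()
    if hi == lo:
        return np.ones_like(raw)          # every submission tied
    scaled = (raw - lo) / (hi - lo)
    return scaled if maximise else 1.0 - scaled

# The two RMSEs quoted earlier in the thread plus a hypothetical third submission:
print(normalise([1.4545, 3.4738, 2.5], maximise=False))
# -> approximately [1.0, 0.0, 0.48]; lower raw RMSE gives a higher normalised score
```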

CodeJoe

I think I get you now. The recent scores are based on the new dummy validation scores from the starter notebook?

That would make sense then. I just tested it: I resubmitted an old result and it gave a better score than the previous submission.

Koleshjr
Multimedia University of Kenya

Even if we are being evaluated on dummy data, there should not be a mix-up: if higher is better, then that should be applied consistently. Or @AJoel, if I were to advise, it would be better to rescore the LB and retain those two weeks' scores until the next two weeks. So even for these two weeks (where we are predicting for weeks 50 and 51), we would still be evaluated on weeks 48 and 49 until the next rolling values come out, and so on. That way, we won't be evaluated on dummy data. Or better yet (as a competitor), ignore the LB completely and use the already-published week 48 and 49 values to check your scores locally.
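
Following the "check your scores locally" suggestion, a minimal sketch of scoring a submission against the already-published week 48 and 49 values (the file and column names here are placeholders, not the challenge's actual schema):

```python
import numpy as np
import pandas as pd

# Placeholder file and column names -- substitute the actual challenge files.
actuals = pd.read_csv("published_prices_weeks_48_49.csv")  # columns: ID, price
submission = pd.read_csv("my_submission.csv")              # columns: ID, predicted_price

# Score only the rows whose true values have already been published.
merged = submission.merge(actuals, on="ID", how="inner")

errors = merged["predicted_price"] - merged["price"]
mae = np.mean(np.abs(errors))
rmse = np.sqrt(np.mean(errors ** 2))
print(f"Local MAE:  {mae:.4f}")
print(f"Local RMSE: {rmse:.4f}")
```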

CodeJoe

True, the leaderboard makes it difficult to even trust our local cross-validation 😅

keystats
Mount Kenya University

Somehow I disagree with your claim that higher is better, but I stand to be corrected... Let's say, for example, we have 3 true values and two candidates, A and B.

True values

  • T₁ = 40
  • T₂ = 39
  • T₃ = 38

Candidate A predictions

A₁ = 39.5, A₂ = 38.6, A₃ = 37.4

Step 1 — Compute individual errors

🔹 For T₁ = 40, A₁ = 39.5

Error = 39.5 − 40 = −0.5
Absolute Error = |−0.5| = 0.5
Squared Error = (−0.5)² = 0.25

🔹 For T₂ = 39, A₂ = 38.6

Error = 38.6 − 39 = −0.4
Absolute Error = |−0.4| = 0.4
Squared Error = (−0.4)² = 0.16

🔹 For T₃ = 38, A₃ = 37.4

Error = 37.4 − 38 = −0.6
Absolute Error = |−0.6| = 0.6
Squared Error = (−0.6)² = 0.36

Candidate A summary of individual errors

Point | Error | Abs Error | Sq Error
1     | −0.5  | 0.5       | 0.25
2     | −0.4  | 0.4       | 0.16
3     | −0.6  | 0.6       | 0.36

Step 2 — Calculate MAE for A

MAE = (0.5 + 0.4 + 0.6) / 3 = 1.5 / 3 = 0.5

Step 3 — Calculate RMSE for A

MSE = (0.25 + 0.16 + 0.36) / 3 = 0.77 / 3 = 0.256666…

RMSE = √0.256666… ≈ 0.5066

Final scores for Candidate A

  • MAE = 0.50
  • RMSE ≈ 0.5066

Candidate B predictions

B₁ = 38.7, B₂ = 37.9, B₃ = 36.8

Step 1 — Compute individual errors

🔹 For T₁ = 40, B₁ = 38.7

Error = 38.7 − 40 = −1.3
Absolute Error = |−1.3| = 1.3
Squared Error = (−1.3)² = 1.69

🔹 For T₂ = 39, B₂ = 37.9

Error = 37.9 − 39 = −1.1
Absolute Error = |−1.1| = 1.1
Squared Error = (−1.1)² = 1.21

🔹 For T₃ = 38, B₃ = 36.8

Error = 36.8 − 38 = −1.2
Absolute Error = |−1.2| = 1.2
Squared Error = (−1.2)² = 1.44

Step 2 — Calculate MAE for B

MAE = (1.3 + 1.1 + 1.2) / 3 = 3.6 / 3 = 1.2

Step 3 — Calculate RMSE for B

MSE = (1.69 + 1.21 + 1.44) / 3 = 4.34 / 3 = 1.446667…

RMSE = √1.446667… ≈ 1.20277

Final scores for Candidate B

  • MAE = 1.2
  • RMSE ≈ 1.20277

This shows that lower is better 🤔
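
The arithmetic above checks out; here is a minimal script to reproduce it (plain numpy, which may differ from whatever the platform computes before normalisation):

```python
import numpy as np

# True values and the two candidates from the worked example above
y_true = np.array([40.0, 39.0, 38.0])
candidate_a = np.array([39.5, 38.6, 37.4])
candidate_b = np.array([38.7, 37.9, 36.8])

def mae(y, p):
    """Mean absolute error."""
    return np.mean(np.abs(p - y))

def rmse(y, p):
    """Root mean squared error."""
    return np.sqrt(np.mean((p - y) ** 2))

print(f"A: MAE={mae(y_true, candidate_a):.4f}, RMSE={rmse(y_true, candidate_a):.4f}")
# A: MAE=0.5000, RMSE=0.5066
print(f"B: MAE={mae(y_true, candidate_b):.4f}, RMSE={rmse(y_true, candidate_b):.4f}")
# B: MAE=1.2000, RMSE=1.2028
```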

keystats
Mount Kenya University

The leaderboard has been mysterious since day 1... I personally even got frustrated and stopped submitting like 20 days ago, because locally I was getting an MAE below 0.5 every time, but when I submitted I was seeing wonders 😂.

keystats
Mount Kenya University

Trust them; they will pay off eventually if they are good.

Koleshjr
Multimedia University of Kenya

Please read the article I just linked. They do normalization before they show the scores on the leaderboard. In the traditional sense, both of your explanations make sense, but then they normalize the scores so that they are maximized, even for the minimization objectives.

The article clearly states that they do normalization so that all metrics are maximized, meaning higher is better!

keystats
Mount Kenya University

[screenshot]

Closer to zero tops the LB.

Koleshjr
Multimedia University of Kenya

But the normalization doesn't seem right in this case.

CodeJoe

That has been the problem from the start. I think they haven't rescored the board for older submissions yet. Try resubmitting a submission you posted before.