🌾 Join the Buzz: Clarification on leaderboard s...
Watch it carefully.
For example @keystats has an RMSE of 1.4545 and an MAE of 1.1750
and I have an RMSE of 3.4738 and an MAE of 2.573709476.
So it means @keystats must be ahead of me on the leaderboard even with the formula from @J0NNY in that chat.
Am I missing something?
It is indeed messed up, but maybe that is not the best example to use, as that is the right behavior. Higher is better, so you should be above him. But @micadee, for example, has lower scores and is in second. That shouldn't be the case. So it is indeed confusing!
I’m getting higher RMSE and MAE scores than him, which should normally correspond to a lower performance score, right? Conversely, if I had lower RMSE and MAE, I would expect a higher score.
At the moment, the leaderboard seems quite off, which is really confusing. I think I’ll wait for the last two weeks before drawing any conclusions, when the results will be more meaningful.
I mean, in the traditional sense, yes, but the score you are seeing on the LB is not the actual MAE/RMSE. Those have already been normalized.
💻 Introducing Multi-Metric Evaluation, or One Metric to Rule them All
Read the above article. It says:
All metric scores are normalised before being shown on the leaderboard. This ensures fairness when a challenge includes both metrics you want to maximise (such as Accuracy) and metrics you want to minimise (such as Log Loss).
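For intuition, here is a minimal sketch of what such a scheme could look like. The article does not give the exact formula, so the min-max scaling below is an assumption for illustration only; the point is just that a lower raw RMSE/MAE comes out as a higher score after rescaling.

```python
import numpy as np

def normalise(raw_scores, higher_is_better):
    """Illustrative min-max rescaling so that 1.0 is always 'best'.

    NOTE: this is an assumed scheme for illustration, not the platform's
    published formula. It maps every metric onto 0..1 and flips the
    direction for metrics you want to minimise (RMSE, MAE, Log Loss, ...).
    """
    raw = np.asarray(raw_scores, dtype=float)
    lo, hi = raw.min(), raw.max()
    if hi == lo:                        # everyone tied -> everyone gets 1.0
        return np.ones_like(raw)
    scaled = (raw - lo) / (hi - lo)
    return scaled if higher_is_better else 1.0 - scaled

# Raw RMSEs quoted earlier in this thread: 1.4545 (@keystats) vs 3.4738
print(normalise([1.4545, 3.4738], higher_is_better=False))  # -> [1. 0.]
```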
So the score has already been normalized, I guess.
But still, that doesn't mean the LB is not messed up; it is, but in a mixed way. The calculation is somehow wrong, I don't know.
I think I get you now. The recent scores are based on the new dummy validation data from the starter notebook?
That would make sense then. I just tested that: I resubmitted an old result and it gave a better score than the previous submission.
Even if we are being evaluated on dummy data, there should not be a mix-up; if higher is better, that should be applied consistently. Or @AJoel, if I were to advise: it would be better to rescore the LB and retain those two weeks' scores until the next two weeks, so even now, while we are predicting for weeks 50 and 51, we still get evaluated on weeks 48 and 49 until the next rolling values come out, and so on. That way, we won't be evaluated on dummy data. Or better yet (as a competitor), ignore the LB completely and use the already-published week 48 and 49 values to check your scores locally.
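If anyone wants to do that local check, here is a minimal sketch. The file and column names are hypothetical placeholders, not the challenge's actual files; point them at the published week 48/49 values and your own submission.

```python
import pandas as pd
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Hypothetical file/column names -- adjust to the actual published data.
truth = pd.read_csv("week_48_49_actuals.csv")   # assumed columns: ID, value
preds = pd.read_csv("my_submission.csv")        # assumed columns: ID, value

merged = truth.merge(preds, on="ID", suffixes=("_true", "_pred"))

mae = mean_absolute_error(merged["value_true"], merged["value_pred"])
rmse = mean_squared_error(merged["value_true"], merged["value_pred"]) ** 0.5

print(f"Local MAE:  {mae:.4f}")
print(f"Local RMSE: {rmse:.4f}")
```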
True, the leaderboard makes our local cross-validation difficult to even trust 😅
Somehow I disagree with your claim that higher is better, but I stand to be corrected... Let's say, for example, we have 3 true values and two candidates, A and B.
True values
T₁ = 40 T₂ = 39 T₃ = 38
✅ Candidate A predictions
A₁ = 39.5 A₂ = 38.6 A₃ = 37.4
Step 1 — Compute individual errors
🔹 For T₁ = 40, A₁ = 39.5
Error = 39.5 − 40 = −0.5 Absolute Error = |−0.5| = 0.5 Squared Error = (−0.5)² = 0.25
🔹 For T₂ = 39, A₂ = 38.6
Error = 38.6 − 39 = −0.4 Absolute Error = |−0.4| = 0.4 Squared Error = (−0.4)² = 0.16
🔹 For T₃ = 38, A₃ = 37.4
Error = 37.4 − 38 = −0.6 Absolute Error = |−0.6| = 0.6 Squared Error = (−0.6)² = 0.36
⭐ Candidate A summary of individual errors
Point   Error   Abs Error   Sq Error
1       −0.5    0.5         0.25
2       −0.4    0.4         0.16
3       −0.6    0.6         0.36
Step 2 — Calculate MAE for A
MAE = (0.5 + 0.4 + 0.6) / 3 = 1.5 / 3 = 0.5
Step 3 — Calculate RMSE for A
MSE = (0.25 + 0.16 + 0.36) / 3 = 0.77 / 3 = 0.256666…
RMSE = √0.256666… RMSE ≈ 0.5066
✅ Final scores for Candidate A: MAE = 0.5, RMSE ≈ 0.5066
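As a quick sanity check, the same arithmetic in a few lines of Python (true values and predictions taken from the example above):

```python
# True values and Candidate A's predictions from the example above
true_vals = [40, 39, 38]
cand_a = [39.5, 38.6, 37.4]

abs_errors = [abs(p - t) for p, t in zip(cand_a, true_vals)]    # [0.5, 0.4, 0.6]
sq_errors = [(p - t) ** 2 for p, t in zip(cand_a, true_vals)]   # [0.25, 0.16, 0.36]

mae = sum(abs_errors) / len(abs_errors)          # 0.5
rmse = (sum(sq_errors) / len(sq_errors)) ** 0.5  # ≈ 0.5066

print(f"Candidate A  MAE={mae:.4f}  RMSE={rmse:.4f}")
```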
✅ Candidate B predictions
B₁ = 38.7 B₂ = 37.9 B₃ = 36.8
Step 1 — Compute individual errors
🔹 For T₁ = 40, B₁ = 38.7
Error = 38.7 − 40 = −1.3 Absolute Error = |−1.3| = 1.3 Squared Error = (−1.3)² = 1.69
🔹 For T₂ = 39, B₂ = 37.9
Error = 37.9 − 39 = −1.1 Absolute Error = |−1.1| = 1.1 Squared Error = (−1.1)² = 1.21
🔹 For T₃ = 38, B₃ = 36.8
Error = 36.8 − 38 = −1.2 Absolute Error = |−1.2| = 1.2 Squared Error = (−1.2)² = 1.44
Step 2 — Calculate MAE for B
MAE = (1.3 + 1.1 + 1.2) / 3 = 3.6 / 3 = 1.2
Step 3 — Calculate RMSE for B
MSE = (1.69 + 1.21 + 1.44) / 3 = 4.34 / 3 = 1.446667…
RMSE = √1.446667… RMSE ≈ 1.20277
✅ Final scores for Candidate B: MAE = 1.2, RMSE ≈ 1.2028
This shows lower is better 🤔: Candidate A, with the smaller errors, ends up with the lower MAE and RMSE.
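Both readings can be reconciled: on the raw metrics lower is indeed better, and the leaderboard then rescales those raw errors so that the displayed (normalized) score is higher for the better submission. A short sklearn check of both candidates from the example:

```python
from sklearn.metrics import mean_absolute_error, mean_squared_error

true_vals = [40, 39, 38]
candidates = {"A": [39.5, 38.6, 37.4], "B": [38.7, 37.9, 36.8]}

for name, preds in candidates.items():
    mae = mean_absolute_error(true_vals, preds)
    rmse = mean_squared_error(true_vals, preds) ** 0.5
    print(f"Candidate {name}: MAE={mae:.4f}  RMSE={rmse:.4f}")

# Candidate A: MAE=0.5000  RMSE=0.5066
# Candidate B: MAE=1.2000  RMSE=1.2028
```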
The leaderboard has been mysterious since day 1... I personally even got frustrated and stopped submitting like 20 days ago, because locally I was getting an MAE below 0.5 every time, but when I submitted I was seeing wonders 😂.
Trust them; they will pay off eventually if they are good.
Please read the article I just linked. They do normalization before they show the scores on the leaderboard. In the traditional sense, both your explanations make sense, but then they normalize the scores so that higher is better, even for the minimization objectives.
The article clearly states that they do normalization so that all metrics are maximized, meaning higher is better!
📷
Closer to zero tops the LB.
But the normalization doesn't seem right in this case.
That has been the problem from the start. I think they haven't refreshed the board yet for older submissions. Try resubmitting something you submitted before.