AI/ML for 5G-Energy Consumption Modelling by ITU AI/ML in 5G Challenge

20 000 CHF

Completed (over 2 years ago)

Start

Jul 26, 23

Oct 13, 23

Reveal

Oct 17, 23

Koleshjr

Multimedia university of kenya

For Transparency

Platform · 13 Oct 2023, 08:38 · 2

Hello Zindians,

@Zindi @nicolapiovesan

I would love to make a suggestion. As you all already know, using future values is prohibited. But if someone submits a solution that uses future values, whether knowingly or unknowingly, we as participants won't know for sure, since after the competition ends the evaluation is done by the organization sponsoring the challenge.

With all the uncertainties and gray areas around these "future" values, and around whether someone has used them or not, I am suggesting that the top 10 or 20 solutions picked from the private leaderboard be posted here in the discussions, so that when a solution is disqualified or passes, we as the community know for sure why. This will enhance transparency in the evaluation process and allow public scrutiny.

I understand that this dataset would have been sampled properly to avoid this situation, but only once we as the community of participants know why a certain solution passed and another did not will we be satisfied and learn better ways of handling this kind of situation in future.

This is my own personal opinion and it's just a humble suggestion. I don't know what other participants think. Feel free to share your thoughts.

Discussion 2 answers

Juliuss

Freelance

Very good thoughts @Koleshjr. Always on point.

13 Oct 2023, 09:03

Upvotes 1

rafael_zimmermann

The point is that the sponsor made it clear that the competition aims to produce algorithms that solve real-world problems. That said, any kind of data leakage makes a solution non-functional in reality. This applies not only to future data leakage but to any leakage where the solution relies on patterns that exist in the training or test data only because of some mistake, which makes it unviable in the real world. Therefore, any mistake that could lead to better scores without yielding a real-world solution should be considered data leakage. So, for transparency, it would be good to have clear rules, or at least the understanding that the real-world solution will be the one to win the competition.

Another point that stands out is that simply trusting that competition participants won't make these errors or engage in data leakage is a naive way to approach the issue. We know that in reality this can happen either deliberately or accidentally. As highlighted, there's a large gray area in this matter, and perhaps only the sponsors truly know what constitutes data leakage and what does not. Therefore, it's important to consider this.
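
To make the "future values" concern a bit more concrete, here is a minimal sketch of one way a participant could self-check a feature for this kind of leakage. It is only an illustration: the function names (`past_mean`, `centered_mean`, `uses_future_values`) and the sample numbers are made up, not taken from the competition data or rules, and it covers only look-ahead leakage, not the other kinds mentioned above. What officially counts as a "future" value remains the sponsor's call.

```python
from typing import Callable, List, Sequence

# Hypothetical type: a function turning a time-ordered series into one feature per step.
FeatureFn = Callable[[Sequence[float]], List[float]]


def past_mean(series: Sequence[float], window: int = 3) -> List[float]:
    """Mean of up to `window` values strictly before each step (no look-ahead)."""
    out = []
    for t in range(len(series)):
        past = series[max(0, t - window):t]
        out.append(sum(past) / len(past) if past else 0.0)
    return out


def centered_mean(series: Sequence[float], window: int = 3) -> List[float]:
    """Mean over a window centered on each step, so it peeks at later values."""
    half = window // 2
    out = []
    for t in range(len(series)):
        neighbours = series[max(0, t - half):t + half + 1]
        out.append(sum(neighbours) / len(neighbours))
    return out


def uses_future_values(feature_fn: FeatureFn, series: Sequence[float],
                       bump: float = 1000.0) -> bool:
    """Return True if changing values after step t alters features at or before t."""
    base = feature_fn(series)
    for t in range(len(series) - 1):
        # Keep everything up to step t, distort everything after it.
        perturbed = list(series[:t + 1]) + [v + bump for v in series[t + 1:]]
        if feature_fn(perturbed)[:t + 1] != base[:t + 1]:
            return True
    return False


# Made-up hourly energy readings, purely for illustration.
energy = [10.0, 12.0, 11.0, 15.0, 14.0]
print(uses_future_values(past_mean, energy))      # False
print(uses_future_values(centered_mean, energy))  # True
```

The idea is simply that a feature computed for step t should not change when only later values change. A check like this could be run by participants before submitting, though it would not replace the sponsor's own review.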

13 Oct 2023, 09:17

Upvotes 3
