💻 Challenge Chat: Leaderboard and future values

AI/ML for 5G-Energy Consumption Modelling by ITU AI/ML in 5G Challenge

20 000 CHF

Start

Jul 26, 23

Oct 13, 23

Reveal

Oct 17, 23

LROUZZ

Ecole polytechnique de tunisie

Platform · 9 Oct 2023, 18:09 · 24

Hello,

I've just seen a lot of ridiculous scores on the leaderboard, so I tried using future values myself, and my score dropped from 1.32 (without future values) to 0.98. For my final selection I will choose the predictions made without future values, but I want to ask @nicolapiovesan to check the top 20 solutions on the leaderboard.

Best regards,

Discussion 24 answers

mmhiri

Good point! But why 20 and not 30 or even 50?

9 Oct 2023, 18:15

Upvotes 0

mmhiri

Best solution in my opinion: allow using future data and extend the deadline by one month. @nicolapiovesan

9 Oct 2023, 18:23

Upvotes 0

University of Yaoundé I

What do you mean by "allow using future data"?

replied to mmhiri · 9 Oct 2023, 18:39

Upvotes 0

Rajat_Ranjan

Allstate

I guess it should be based on the rules of the competition. We know there is a gray area, but the hosts should comment on this and clear up the doubt so that we can share the correct solution.

9 Oct 2023, 18:35

Upvotes 0

Koleshjr

Multimedia university of kenya

The host has done this several times, though. Their stance is: do not use future values in your final submissions, as those solutions will be disqualified.

replied to Rajat_Ranjan · 9 Oct 2023, 21:56

Upvotes 3

nicolapiovesan

Hi, thanks for pointing out this problem.

As stated in many discussions, the goal of the challenge is to model how the multiple instantaneous features collected in each hour affect the energy consumption in that hour. It must be clear that using future values as input does not make sense, since in the real world such values will not be available.

To answer your question, could the current top 10 participants on the leaderboard please confirm whether or not they are using future values in their solutions? @Yisakberhanu, @rafael_zimmermann, @Krishna_Priya, @NxGTR, @LROUZZ, @heyyou, @tomy4reel, @imakarov, @Koleshjr, @Hakim04

Finally, I'd like to remind everyone that, at the end of the competition, the top participants will be required to submit a report and the code to train/test the model, which will be used to provide the final score. Solutions in which future values are taken as inputs to the model will not be considered.

11 Oct 2023, 13:44

Upvotes 0
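
One rough way to self-check a feature against the host's rule is to recompute it after deleting all later rows: if its values for the earlier hours change, it was reading future data. The sketch below is only an illustration under assumptions, not the host's validation procedure. The helper `uses_future_values`, the column names `Time` and `load`, and the toy values are all hypothetical.

```python
import pandas as pd

def uses_future_values(build_features, df, time_col, cutoff):
    """Return True if features for rows up to `cutoff` change once later rows are removed."""
    mask = df[time_col] <= pd.Timestamp(cutoff)
    on_full = build_features(df).loc[mask]        # built with later rows visible
    on_truncated = build_features(df.loc[mask])   # built with later rows deleted
    return not on_full.equals(on_truncated)

# Hypothetical hourly data for a single base station.
hours = pd.DataFrame({
    "Time": pd.to_datetime([
        "2023-01-01 00:00", "2023-01-01 01:00", "2023-01-01 02:00",
        "2023-01-01 03:00", "2023-01-01 04:00",
    ]),
    "load": [0.2, 0.4, 0.6, 0.8, 1.0],
})

def lag_feature(d):
    return d["load"].shift(1)    # value from the previous hour

def lead_feature(d):
    return d["load"].shift(-1)   # value from the next hour, i.e. a future value

lag_leaks = uses_future_values(lag_feature, hours, "Time", "2023-01-01 02:00")
lead_leaks = uses_future_values(lead_feature, hours, "Time", "2023-01-01 02:00")
print(lag_leaks, lead_leaks)
```

A check like this only catches leakage that is visible at the chosen cutoff, so it is a sanity test rather than a proof that a pipeline is clean.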

Koleshjr

Multimedia university of kenya

Our current score uses future values, but we won't select that submission since, as you have already clarified many times, such solutions won't be considered. Thank you for confirming that again.

replied to nicolapiovesan · 11 Oct 2023, 14:07

Upvotes 2

rafael_zimmermann

Thank you for bringing up this important issue. To be fully transparent, my best score on the leaderboard does indeed involve the use of future values. However, as you clearly outlined in the competition guidelines, only models that do not use future data will be considered for final submissions and validation. The focus is truly on creating a model that is applicable in the real world, where such future data would not be available.

replied to nicolapiovesan · 11 Oct 2023, 16:56

Upvotes 0

Krishna_Priya

My team's current best score on LB does NOT use future value features as input to the model. As this rule was already established a month back, I stopped creating features using future values.

replied to nicolapiovesan · 11 Oct 2023, 17:05

Upvotes 0

heyyou

Ecole polytechnique de tunisie

I'm completely confident that no model can achieve a score below 1.2 without using future data, let alone get down to 0.8. Just take a look at the feature importance plot to see for yourself.

replied to Krishna_Priya · 11 Oct 2023, 17:32

Upvotes 1

Koleshjr

Multimedia university of kenya

What feature engineering are they doing that we aren't? This is so demotivating, haha.

replied to heyyou · 11 Oct 2023, 17:41

Upvotes 1

tomy4reel

Nexford University

Please, does this include aggregate base station features like mean, median, std, etc.?

replied to nicolapiovesan · 11 Oct 2023, 17:47

Upvotes 0

Krishna_Priya

If you decide to calculate aggregate features, ideally any measure of central tendency should be calculated using only past values. You should not just calculate the mean without filtering the data.

PS: This is just my opinion, but otherwise it would be an alternate way to leak future data.

replied to tomy4reel · 11 Oct 2023, 17:53

Upvotes 2

tomy4reel

Nexford University

I absolutely agree

replied to Krishna_Priya · 11 Oct 2023, 17:59

Upvotes 0

Koleshjr

Multimedia university of kenya

So @Krishna_Priya, for your current score, are the aggregates from the previous hours?

replied to Krishna_Priya · 11 Oct 2023, 18:03

Upvotes 0

Krishna_Priya

Hey @Koleshjr, for now I cannot comment on whether I am using aggregate features, but yes, any feature being used only has data from the previous hours.

replied to Koleshjr · 11 Oct 2023, 18:09

Upvotes 0

Koleshjr

Multimedia university of kenya

Damn, you are good 👏👏 but we will get there with time.

replied to Krishna_Priya · 11 Oct 2023, 18:15

Upvotes 0

Krishna_Priya

All the best, bro. Let's keep learning from each other. Anyway, we will see a lot of shuffling in the private leaderboard in this one. Fingers crossed, may the best approach win.

replied to Koleshjr · 11 Oct 2023, 18:21

Upvotes 1

yanteixeira

@tomy4reel you have to use .shift(1) to ensure no data leakage.

replied to Krishna_Priya · 12 Oct 2023, 00:36

Upvotes 1
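
The past-only aggregation and the `.shift(1)` suggestion above can be sketched in pandas roughly as follows. This is a minimal illustration rather than any participant's actual pipeline: the column names `BS`, `Time` and `load` and the toy values are assumptions. It takes the stricter reading in which even the current hour is excluded from the aggregate; whether the current hour may be included depends on whether the column is an input feature or the target.

```python
import pandas as pd

# Toy data; "BS", "Time" and "load" are assumed column names for illustration.
df = pd.DataFrame({
    "BS": ["B_0"] * 4 + ["B_1"] * 3,
    "Time": pd.to_datetime([
        "2023-01-01 00:00", "2023-01-01 01:00", "2023-01-01 02:00", "2023-01-01 03:00",
        "2023-01-01 00:00", "2023-01-01 01:00", "2023-01-01 02:00",
    ]),
    "load": [0.2, 0.4, 0.6, 0.8, 0.5, 0.1, 0.3],
})
df = df.sort_values(["BS", "Time"]).reset_index(drop=True)

# Leaky: the per-station mean is taken over all hours, including later ones.
df["load_mean_leaky"] = df.groupby("BS")["load"].transform("mean")

# Past-only: shift(1) drops the current hour, then expanding() averages
# everything seen so far. The first hour of each station has no history (NaN).
df["load_mean_past"] = df.groupby("BS")["load"].transform(
    lambda s: s.shift(1).expanding().mean()
)

print(df)
```

Sorting by station and time before shifting matters here, since `shift(1)` moves values by row position, not by timestamp.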

Yisakberhanu

wachemo university

Yes, I used future values, but there is not much difference.

replied to nicolapiovesan · 12 Oct 2023, 04:10

Upvotes 0

University of Yaoundé I

Looking forward to seeing the best solutions and/or approaches. On this one, I'm completely lost 🙌🏿

replied to Krishna_Priya · 12 Oct 2023, 09:19

Upvotes 0

Koleshjr

Multimedia university of kenya

Me too @ff 😂😂 I have given up after seeing people get 0.82 with no future values, what!!! That's freaking impressive tbh, and I don't think I could get there even if I were given 30 more days 😂

replied to ff · 12 Oct 2023, 09:47

Upvotes 0

University of Yaoundé I

😂😂 The day after tomorrow there will be a terrible shake-up in the ranking!

replied to Koleshjr · 12 Oct 2023, 10:59

Upvotes 0

rafael_zimmermann

The use of future data is a problem that can be subjective if the rules aren't clear, such as issues related to aggregation or how to handle null values. It's not necessarily just about using lead functions; training on the complete dataset is also a form of using future data. It would be interesting and fair to have an objective rule to justly choose the top 10.

replied to nicolapiovesan · 12 Oct 2023, 15:45

Upvotes 0
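
One possible reading of the point about null handling and training on the complete dataset is that anything fitted on the data (a fill value, a scaler, the model itself) should only see rows from before the period being predicted. The sketch below illustrates that reading with a chronological split and an imputation value taken from the training period only. The helper `chronological_split`, the column names and the toy values are assumptions, and this is not a rule stated by the host.

```python
import pandas as pd

def chronological_split(df, time_col, cutoff):
    """Rows before `cutoff` go to train; rows at or after it go to validation."""
    cutoff = pd.Timestamp(cutoff)
    train = df[df[time_col] < cutoff].copy()
    valid = df[df[time_col] >= cutoff].copy()
    return train, valid

# Hypothetical data with missing values; "Time" and "load" are assumed names.
df = pd.DataFrame({
    "Time": pd.date_range("2023-01-01", periods=6, freq="D"),
    "load": [0.2, None, 0.4, 0.9, None, 0.7],
})
train, valid = chronological_split(df, "Time", "2023-01-04")

# The fill value is computed from the training period only and then reused
# on the validation period, so later rows never influence it.
fill_value = train["load"].mean()
train["load"] = train["load"].fillna(fill_value)
valid["load"] = valid["load"].fillna(fill_value)

print(fill_value)
print(valid)
```

Filling nulls with a mean over the whole dataset would instead mix later observations into earlier rows, which is the kind of subtle leakage the post describes.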
