
AI/ML for 5G-Energy Consumption Modelling by ITU AI/ML in 5G Challenge

20 000 CHF
Challenge completed ~2 years ago
Prediction
1019 joined
278 active
Start: Jul 26, 23
Close: Oct 13, 23
Reveal: Oct 17, 23
Koleshjr
Multimedia University of Kenya
What was the TRICK?
Platform · 19 Oct 2023, 16:02 · 13

Now that the dust has settled, what was the trick for getting scores less than 0.08 with no future values?

@Krishna_Priya @Yisakberhanu @LROUZZ @rafael_zimmermann

Discussion · 13 answers
Krishna_Priya

Hi @Koleshjr, I will share the code soon; however, one small detail:

ANNs worked much better than any other algorithm and could generalize better for me.

19 Oct 2023, 17:41
Upvotes 2
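A minimal sketch of the kind of ANN-versus-GBDT comparison being described here, not the winning solution; the features, target, and hyperparameters below are placeholder assumptions for illustration only:

# Minimal sketch (not the competition code): fit a small ANN (MLP) and a GBDT
# on placeholder tabular data and compare their MAE.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 8))                               # stand-in for engineered features
y = X @ rng.normal(size=8) + 0.1 * rng.normal(size=2000)     # stand-in for the energy target
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

ann = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0).fit(X_tr, y_tr)
gbdt = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)

print("ANN  MAE:", mean_absolute_error(y_te, ann.predict(X_te)))
print("GBDT MAE:", mean_absolute_error(y_te, gbdt.predict(X_te)))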
Koleshjr
Multimedia University of Kenya

Wow, so ANNs were the key to this competition? Wish we had tried it 😂

Krishna_Priya

yeah, as stated in their papers :)

Koleshjr
Multimedia University of Kenya

What's the best score you could get with GBDT?

Krishna_Priya

Around 1.14 was the MAE, but we added some features after that, so I think it would go down to roughly 0.9, as opposed to 0.78 with the ANN. I don't have the MAPE benchmarks, as all the iteration was on MAE. Also, smoothing the load feature was key; there is a lot of noise in the load feature. But there are many more intricacies, stay tuned for the solution.
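A minimal sketch of the load-smoothing idea mentioned above, using SciPy's savgol_filter with its default settings; the column name, window length, and polynomial order are assumptions for illustration, not the competition code:

# Sketch only: smooth a noisy "load" column with SciPy's Savitzky-Golay filter.
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
load = np.sin(np.linspace(0, 20, 500)) + 0.2 * rng.standard_normal(500)   # synthetic noisy load
df = pd.DataFrame({"load": load})

# The default mode='interp' uses a window centred on each point, which is what
# the leakage discussion further down in this thread is about.
df["load_smooth"] = savgol_filter(df["load"].to_numpy(), window_length=25, polyorder=3)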

Koleshjr
Multimedia University of Kenya

Cool, thanks Krishna, can't wait for the whole solution.

SaltigAI

This seems to be right; my best score was with an ANN, but I did not give it much attention at the time because the scores of some ensemble models misled me.

Professor

Congratulations @Krishna_Priya, you are incredible. If there's anything I've learned from your achievements in the past few weeks, it's your ability to take the time to read the instructions and accompanying materials thoroughly. Cheers again, a well-deserved win. 🥂

Krishna_Priya

Thank you for the kind words @Professor, and yes, your observation is absolutely correct :)

newbee

Hi @Krishna_Priya,

Thanks for your presentation today. Your code uses the default implementation of the Savitzky–Golay filter in SciPy, which actually uses future load data to do the smoothing. As we all know, the load is highly correlated with the energy, so you leak some of the target values by mistake. That's why load_smooth is one of the most important features.

Please refer to https://en.wikipedia.org/wiki/Savitzky%E2%80%93Golay_filter for the details of the Savitzky–Golay filter.

I like your ANN approach, but I felt I should let you know about the potential data leakage there. Just my two cents.

27 Oct 2023, 16:09
Upvotes 2
Krishna_Priya

Hi @newbee,

I understand your doubt. Let me help clear it up.

By definition, the Savitzky–Golay filter has a window, and Wikipedia explains the theory. However, if you go inside the SciPy implementation of the function, you will find that, for a data point at position i, it takes the window

(i - window_length, i)

SciPy's implementation of the filter: https://github.com/scipy/scipy/blob/v1.11.3/scipy/signal/_savitzky_golay.py#L230-L357

Please refer to these lines, in order:

1. Line 230:

def savgol_filter(x, window_length, polyorder, deriv=0, delta=1.0,
                  axis=-1, mode='interp', cval=0.0):

2. Lines 215 and 226:

def _fit_edges_polyfit(x, window_length, polyorder, deriv, delta, axis, y):

_fit_edge(x, n - window_length, n, n - halflen, n, axis,
          polyorder, deriv, delta, y)

3. Line 171:

def _fit_edge(x, window_start, window_stop, interp_start, interp_stop,
              axis, polyorder, deriv, delta, y):

Nowhere in the code does it do any manipulation based on window_length/2 + 1 or i+1.

Regards,

Krishna Priya

newbee

Hi @Krishna_Priya,

Thanks for this discussion. It will improve our DS skills and knowledge.

For the code, I believe the part that uses the future data (i+1, i+2, ...) is at this line:

Line 351: y = convolve1d(x, coeffs, axis=axis, mode="constant")

Without going into too many detailed implementations, let's do a simple test.

import numpy as np
from scipy.signal import savgol_filter

x = np.array([2, 2, 5, 2, 1, 0, 1, 4, 9])

savgol_filter(x, 5, 2)

Result:

array([1.65714286, 3.17142857, 3.54285714, 2.85714286, 0.65714286, 0.17142857, 1. , 4. , 9. ])

If we change the second element of x and do it again,

x = np.array([2, 3, 5, 2, 1, 0, 1, 4, 9])

savgol_filter(x, 5, 2)

Result:

array([1.91428571, 3.54285714, 3.88571429, 2.77142857, 0.65714286, 0.17142857, 1. , 4. , 9. ])

You will see that the first element of the filtered array also increased. Why did it change? Because the algorithm uses the elements at i-2, i-1, i, i+1, i+2 to compute the value at i, based on the window of 5. That's where future data is used. You can see it clearly in the animated picture on Wikipedia.
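For completeness, a hedged sketch (not from either participant) of one way to avoid that leak: recompute the smoothed value at each index using only the samples at or before it. The helper name and parameters are illustrative assumptions:

# Sketch of a causal (past-only) smoother built on savgol_filter, so the value
# at index i never depends on x[i+1], x[i+2], ...
import numpy as np
from scipy.signal import savgol_filter

def causal_savgol(x, window_length=5, polyorder=2):
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for i in range(len(x)):
        w = min(window_length, i + 1)
        if w % 2 == 0:
            w -= 1                       # keep the window odd for savgol_filter
        if w <= polyorder:
            out[i] = x[i]                # not enough history to fit the polynomial
        else:
            past = x[i - w + 1:i + 1]    # only samples at or before i
            out[i] = savgol_filter(past, w, polyorder)[-1]
    return out

x = np.array([2, 2, 5, 2, 1, 0, 1, 4, 9], dtype=float)
print(causal_savgol(x, 5, 2))
# Changing a later element of x now leaves the earlier outputs unchanged.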

ziki1414
International University of Applied Sciences, Bad Honnef, Germany

Wow, I am learning something new.