Hi @Koleshjr, I will share the code soon, however a small detail:
ANNs worked much better than any other algorithm and could generalize better for me.
Wow so ANNs were the key to this competition ? Wish we tried it 😂
yeah, as stated in their papers :)
What's the best score you could get with gbdt?
The MAE was around 1.14, but we added some features after that, so I think it would have gone down to about 0.9, as opposed to 0.78 with the ANN. I don't have MAPE benchmarks, as all the iteration was on MAE. Also, smoothing the load feature was key; there was a lot of noise in it. But there are many more intricacies, stay tuned for the solution.
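For illustration only, smoothing a noisy load-like series with SciPy's Savitzky–Golay filter looks roughly like this; the synthetic series, window length, and polyorder below are made up for the sketch and are not the actual competition settings:

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(0)
t = np.arange(200)

# hypothetical noisy load series: a smooth trend plus Gaussian noise
trend = np.sin(t / 20.0)
load = trend + rng.normal(scale=0.3, size=t.size)

# smooth with a Savitzky-Golay filter (illustrative parameters)
load_smooth = savgol_filter(load, window_length=25, polyorder=2)

# the smoothed series sits much closer to the underlying trend
resid_raw = load - trend
resid_smooth = load_smooth - trend
assert resid_smooth.std() < resid_raw.std()
```

The smoothed column can then be used as a model feature in place of (or alongside) the raw one.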
Coool thanks Krishna , can't wait for the whole solution
This seems to be right; my best score was with an ANN, but I did not give it much attention at the time because the scores of some ensemble models misled me.
Congratulations @Krishna_Priya, you are incredible. If there's anything I've learned from your achievements in the past few weeks, it's your ability to take enough time to read through the instructions and accompanying materials. Cheers again, a well deserved win. 🥂
Thank you for the kind words @Professor, and yes, your observation is absolutely correct :)
Hi, @Krishna_Priya,
Thanks for your presentation today. Your code uses the default settings of the Savitzky–Golay filter in SciPy, which actually uses future load data to do the smoothing. As we all know, the load is highly correlated with the energy, so you leak some of the target information by mistake. That's why load_smooth is one of the most important features.
Please refer to https://en.wikipedia.org/wiki/Savitzky%E2%80%93Golay_filter for details of the Savitzky–Golay filter.
I like your ANN approach, but I feel I should let you know about the potential data leakage there. Just my two cents.
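As a hedged sketch of one way to avoid this kind of leakage (not taken from the winning solution), each point can be smoothed using only current and past samples by fitting the polynomial on a trailing window. The `causal_savgol` helper below is hypothetical, not a SciPy function:

```python
import numpy as np

def causal_savgol(x, window_length, polyorder):
    """Hypothetical leak-free smoother: for each position i, fit a
    polynomial to the trailing window x[i-window_length+1 : i+1] and
    evaluate it at the window's last point, so no future samples are used."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    for i in range(len(x)):
        start = max(0, i - window_length + 1)
        seg = x[start:i + 1]
        t = np.arange(len(seg))
        # shrink the polynomial order near the start where the window is short
        order = min(polyorder, len(seg) - 1)
        coef = np.polyfit(t, seg, order)
        out[i] = np.polyval(coef, len(seg) - 1)
    return out

# changing a future sample must not affect earlier smoothed values
x_a = np.array([2., 2., 5., 2., 1., 0., 1., 4., 9.])
x_b = x_a.copy()
x_b[-1] = 100.0  # perturb only the last sample
a = causal_savgol(x_a, 5, 2)
b = causal_savgol(x_b, 5, 2)
assert np.allclose(a[:-1], b[:-1])
```

The trade-off is lag: a trailing window reacts later to changes than a centered one, which is the usual price of a causal filter.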
Hi @newbee,
I understand your doubt. Let me help you understand it.
By definition, the Savitzky–Golay filter has a window, and Wikipedia explains the theory. However, if you go inside the SciPy implementation of the function, you will find that it takes the window for a data point at position i as
(i-window_length,i)
scipy implemented filter: https://github.com/scipy/scipy/blob/v1.11.3/scipy/signal/_savitzky_golay.py#L230-L357
Please refer to the lines in this order:
1. line 230
def savgol_filter(x, window_length, polyorder, deriv=0, delta=1.0,
axis=-1, mode='interp', cval=0.0)
2. line 215 and 226
def _fit_edges_polyfit(x, window_length, polyorder, deriv, delta, axis, y):
_fit_edge(x, n - window_length, n, n - halflen, n, axis,
polyorder, deriv, delta, y)
3. line 171
def _fit_edge(x, window_start, window_stop, interp_start, interp_stop,
axis, polyorder, deriv, delta, y):
Nowhere in the code does it do any manipulation based on window_length/2 + 1 or i+1.
Regards,
Krishna Priya
Hi @Krishna Priya,
Thanks for this discussion. It will improve our DS skills and knowledge.
As for the code, I believe the part that uses the future data (i+1, i+2, ...) is this line:
line 351. y = convolve1d(x, coeffs, axis=axis, mode="constant").
Without going into too many detailed implementations, let's do a simple test.
x = np.array([2, 2, 5, 2, 1, 0, 1, 4, 9])
savgol_filter(x, 5, 2)
Result:
array([1.65714286, 3.17142857, 3.54285714, 2.85714286, 0.65714286, 0.17142857, 1. , 4. , 9. ])
If we change the second element of the x and do it again,
x = np.array([2, 3, 5, 2, 1, 0, 1, 4, 9])
savgol_filter(x, 5, 2)
Result:
array([1.91428571, 3.54285714, 3.88571429, 2.77142857, 0.65714286, 0.17142857, 1. , 4. , 9. ])
You will see that the first element of the filtered array also increased. Why did it change? Because the algorithm uses elements i-2, i-1, i, i+1, i+2 to compute the value at position i, based on the window of 5. That's where future data is used. You can see it clearly in the animated figure on the Wikipedia page.
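The centered window can also be checked directly. For window_length=5 and polyorder=2, the smoothing weights are symmetric ([-3, 12, 17, 12, -3] / 35), so an interior output value is a weighted sum that includes two samples from after position i. A small sketch:

```python
import numpy as np
from scipy.signal import savgol_filter, savgol_coeffs

x = np.array([2., 3., 5., 2., 1., 0., 1., 4., 9.])

# weights for window_length=5, polyorder=2; symmetric: [-3, 12, 17, 12, -3] / 35
c = savgol_coeffs(5, 2)

# the interior output at i=2 is a weighted sum over x[0:5],
# i.e. it uses x[3] and x[4] -- samples from the "future" of position 2
manual = np.dot(c, x[0:5])
filt = savgol_filter(x, 5, 2)
assert np.isclose(manual, filt[2])
```

Because the weights are symmetric here, the orientation of the convolution does not matter; the point is simply that samples after position i enter the sum.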
Wow I am learning something new.