Hey guys,
Can we use the hour values from the time column to train our model? Energy consumption in the morning hours is clearly lower than during the rest of the day, so an hour feature should improve the model's accuracy in predicting energy consumption. I am asking because previous discussions mentioned that we shouldn't use future values. Can anyone explain what exactly 'future values' means here?
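'Future values' means that if you are predicting for 1300 hrs, you cannot use information from after 1300 hrs, because you won't have it at inference time. For example, shift(-1) pulls the next row's value into the current row, so a feature built that way uses the future. The hour of the row's own timestamp is not a future value, so using it is fine.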
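To make the shift(-1) point concrete, here is a small pandas sketch. The data is a toy example and the column names ("Time", "load") are assumptions for illustration, not necessarily the competition's schema. It also assumes the rows are sorted by time and belong to a single series.

```python
import pandas as pd

# Hypothetical toy data; column names are assumptions for illustration.
df = pd.DataFrame({
    "Time": pd.to_datetime([
        "2023-01-01 11:00",
        "2023-01-01 12:00",
        "2023-01-01 13:00",
        "2023-01-01 14:00",
    ]),
    "load": [0.2, 0.3, 0.5, 0.4],
})

# Safe: the hour of the row's own timestamp is known at inference time.
df["hour"] = df["Time"].dt.hour

# Safe: shift(1) brings the PREVIOUS row's value into the current row (past).
df["load_prev"] = df["load"].shift(1)

# Leaky: shift(-1) brings the NEXT row's value into the current row (future).
df["load_next"] = df["load"].shift(-1)

print(df)
```

In the 1300 hrs row, "load_prev" holds the 1200 hrs value and "load_next" holds the 1400 hrs value. Only the second one is a future value in the sense described above.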
Does this mean we have to take separate data for each timestamp (for example, data before 1300 hrs to predict values at 1300 hrs, and so on) and predict from that? If so, how is that possible? There are 63 unique() timestamps in the time column. Do we have to prepare separate data for each of them, train 63 different models, predict, and then combine the rows? Please guide me, as I am a beginner in data science and machine learning.
Take a look at this discussion
https://zindi.africa/competitions/aiml-for-5g-energy-consumption-modelling/discussions/18596
Krishna has explained it very well:
""" However, we can train on the complete data, as training on complete data will capture the instantaneous physical relationships between the target and independent KPIs, so for the test samples of the first hours your prediction will be based on these instantaneous relationships."""