I think you just have to experiment with the model's predictions. For example, if you feel the model is predicting too many high values, you can scale them down with a multiplier.
# Assuming sub is your submission dataframe
sub['kwh'] = sub['kwh'] * 0.95
sub.to_csv('postprocess_trick_sub.csv', index=False)
You choose the multiplier yourself: 1.02, 0.95, 0.98 and so on. But take note, post-processing is prone to overfitting to the public leaderboard 🥲. This is just an example. You can also look at the distribution of the predictions and adjust the values up to a point, for example capping the kwh values for a consumer device below some number because in the train set they stayed around a particular value. That is basically post-processing. I don't really advise using it unless you are certain you are doing the right thing.
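The capping idea could look roughly like this. It is only a sketch on toy data: the column name `consumer_device`, the values, and the choice of the train maximum as the cap are my assumptions, not something taken from the competition data.

```python
import pandas as pd

# Toy data. The column name 'consumer_device' and all values are made up for illustration.
train = pd.DataFrame({
    "consumer_device": ["a", "a", "a", "b", "b", "b"],
    "kwh": [1.0, 2.0, 3.0, 10.0, 20.0, 30.0],
})
sub = pd.DataFrame({
    "consumer_device": ["a", "a", "b", "b"],
    "kwh": [2.5, 9.0, 15.0, 50.0],
})

# Per-device cap: here the maximum seen in train (an assumption; a high quantile is another option).
caps = train.groupby("consumer_device")["kwh"].max()

# Look up each row's cap, then clip any prediction that exceeds it.
row_caps = sub["consumer_device"].map(caps)
sub["kwh_capped"] = sub["kwh"].clip(upper=row_caps)

print(sub)
```

Keeping the capped values in a separate column makes it easy to see how many predictions changed before you write the file out. The same overfitting warning applies here as with the multiplier.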
The million ... uhhh ... 1500 dollar question ...
😁😁
Cool