Hello folks!
What is the correct order for filling NaNs, scaling, and feature engineering?
What do you usually do first? My order is: craft new features -> scale -> fill NaNs.
Is there a rigorous justification for one order over another?
I fill NAs first, then scale, then engineer new features. But this also depends on how many values are missing.
Thanks!
I fill NAs first and then scale. Where feature engineering goes in the order depends on the data.
In my experience CatBoost predicts more accurately when I don't fill NaNs at all, which is why I have to develop new features without filling NaNs.
This is because CatBoost natively handles NaNs and categorical values. Other models need both taken care of beforehand.
I don't fill NaNs for decision tree models. For NNs it is necessary, and if sklearn scalers are used, then you have to impute nulls before scaling. The big headache is what to fill the NaNs with.
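For anyone who wants to see the "impute, then scale" order written out, here is a minimal sketch with scikit-learn. The toy arrays and the median fill strategy are assumptions chosen only for illustration, not a recommendation for any particular dataset. Putting both steps in a `Pipeline` keeps the order fixed and means the fill values and scaling statistics are learned from the training data only.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data with missing values (made up for illustration).
X_train = np.array([[1.0, 10.0], [2.0, np.nan], [3.0, 30.0], [np.nan, 40.0]])
X_test = np.array([[np.nan, 20.0], [4.0, np.nan]])

# Step 1: fill NaNs with the per-column median (an assumed strategy).
# Step 2: standardize each column to zero mean and unit variance.
preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Both steps are fitted on the training data only, then reused on the test data.
X_train_ready = preprocess.fit_transform(X_train)
X_test_ready = preprocess.transform(X_test)
```

If the model handles NaNs natively, as described above for CatBoost, the imputation step can simply be left out.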