
Expresso Churn Prediction Challenge

Helping Senegal
$1 000 USD
Completed (over 4 years ago)
Classification
Prediction
1378 joined
437 active
Start: Aug 27, 21
Close: Nov 28, 21
Reveal: Nov 28, 21
Fill NaNs, scale, feature engineering: what order?
Help · 11 Nov 2021, 16:13 · 7

Hello folks!

What is the correct order for filling NaNs, scaling, and feature engineering?

What do you usually do first? My order is: craft new features -> scale -> fill NaNs.

Is there a rigorous justification for one order over another?

Discussion 7 answers
Aaron_Simumba
Entrepreneurs financial centre

I fill NAs first, then scale, and then engineer new features. But this also depends on how many NAs there are.

11 Nov 2021, 16:42
Upvotes 0

I fill NAs first and then scale. Where feature engineering goes in that order depends on the data.

23 Nov 2021, 18:07
Upvotes 0
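
To make the "depends on the data" point concrete, here is a small sketch of how the same ratio feature comes out differently depending on whether it is built before or after filling. The column names (`revenue`, `n_topups`) and values are made up for illustration and are not taken from the competition data; median filling is just one possible choice.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one gap in each column.
df = pd.DataFrame({
    "revenue": [100.0, np.nan, 300.0],
    "n_topups": [4.0, 2.0, np.nan],
})

# Option A: engineer first. A missing input propagates,
# so the ratio is NaN wherever either column is NaN.
df["rev_per_topup_raw"] = df["revenue"] / df["n_topups"]

# Option B: fill with the column medians first, then engineer.
# The ratio is now defined everywhere, but two of the three
# values are built from imputed inputs.
base = df[["revenue", "n_topups"]]
filled = base.fillna(base.median())
df["rev_per_topup_filled"] = filled["revenue"] / filled["n_topups"]

print(df)
```

Option A keeps the "unknown" signal in the new feature, which suits models that accept NaNs. Option B gives a complete column, at the cost of mixing real and imputed ratios.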

In my experience, CatBoost predicts more accurately when I don't fill NaNs at all, so I have to build new features without filling NaNs.

Aaron_Simumba
Entrepreneurs financial centre

This is because CatBoost natively handles NaNs and categorical values. Other models need both taken care of beforehand.

I don't fill NaNs for decision tree models. For neural networks it is necessary, and if sklearn scalers are used, nulls have to be imputed before scaling. The big headache is what to fill the NaNs with.

25 Nov 2021, 16:02
Upvotes 0
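
For models that do need complete inputs, the impute-then-scale order can be sketched as a scikit-learn `Pipeline`. The arrays and the median strategy are assumptions chosen for illustration. Note that recent scikit-learn scalers such as `StandardScaler` ignore NaNs when fitting and pass them through on transform, so the scaler alone would not fail; the imputer goes first here so that the output has no gaps for the downstream model.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy data, made up for illustration.
X_train = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [np.nan, 30.0],
    [4.0, 40.0],
])
X_test = np.array([
    [np.nan, 20.0],
    [3.0, np.nan],
])

# Step 1 fills gaps with the training medians,
# step 2 standardizes using the training mean and std.
preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Fit on the training split only, then reuse the fitted
# statistics on the test split to avoid leakage.
X_train_ready = preprocess.fit_transform(X_train)
X_test_ready = preprocess.transform(X_test)
```

Keeping both steps in one pipeline means the same fill values and scaling statistics are applied inside each cross-validation fold.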