
DSN and Microsoft Skills for Job

Helping Nigeria
Knowledge
Challenge completed over 2 years ago
Prediction
435 joined
221 active
Start
Jun 12, 23
Close
Aug 06, 23
Reveal
Aug 07, 23
cephars
Free Lance
MISSING DATA
Data · 25 Jul 2023, 07:57 · 6

How did you all handle missing data? Just curious to know which technique you used.

Discussion 6 answers
Perkins
University of Zimbabwe

Which missing data? At the branch level or at the daily level?

25 Jul 2023, 10:02
Upvotes 0

Used the mean for the numerical columns, then I dropped the missing values in the categorical column... Didn't want to use the mode in the categorical column... What are your thoughts?

25 Jul 2023, 10:55
Upvotes 1
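For anyone following along, the approach described above (mean for numerical columns, dropping rows with a missing categorical value) could look roughly like this in pandas. This is only a sketch on toy data; the column names `bedroom`, `bathroom` and `title` are assumptions for illustration and may not match the competition files.

```python
import numpy as np
import pandas as pd

# Toy data; column names and values are assumed for illustration only.
df = pd.DataFrame({
    "bedroom": [2.0, np.nan, 4.0, 3.0],
    "bathroom": [1.0, 2.0, np.nan, 3.0],
    "title": ["Flat", None, "Mansion", "Bungalow"],
})

# Fill each numerical column with its own mean.
num_cols = df.select_dtypes(include="number").columns
df[num_cols] = df[num_cols].fillna(df[num_cols].mean())

# Drop rows whose categorical value is missing instead of filling with the mode.
cleaned = df.dropna(subset=["title"]).reset_index(drop=True)
print(cleaned)
```

Dropping rows is only an option on the training set; rows in the test set still need some fill value to get a prediction.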
cephars
Free Lance

I thought of maybe using an algorithm to predict the missing values: linear regression or a random forest regressor for the numerical values, then a classifier for the categorical ones. Or maybe one-hot encode the categorical features first, then use a regressor to predict the missing values. I believe filling the missing values appropriately was the main task for this project. Or even using nearest-neighbour techniques like KNN. I don't think simple statistics like the mean, mode or median were appropriate for this task, since there are many missing values. Just my thoughts, though.

25 Jul 2023, 11:28
Upvotes 1
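The KNN idea mentioned above can be tried with scikit-learn's `KNNImputer`, which fills each gap from the rows that are closest on the observed features. A minimal sketch on toy numerical data follows; the column names and `n_neighbors=2` are assumptions chosen for illustration, not values from the competition.

```python
import numpy as np
import pandas as pd
from sklearn.impute import KNNImputer

# Toy numerical data; column names and values are assumed for illustration only.
df = pd.DataFrame({
    "bedroom": [2.0, 3.0, np.nan, 5.0, 2.0],
    "bathroom": [2.0, 3.0, 3.0, np.nan, 2.0],
    "parking_space": [1.0, 2.0, 2.0, 4.0, np.nan],
})

# Each missing entry is replaced by the average of that column
# over the 2 nearest rows (distance computed on the observed features).
imputer = KNNImputer(n_neighbors=2)
imputed = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
print(imputed)
```

Categorical columns would need to be encoded to numbers first, as suggested above, before an imputer like this can use them.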
Perkins
University of Zimbabwe

Look into MICE (Multiple Imputation by Chained Equations). It can be very useful.

25 Jul 2023, 11:53
Upvotes 1
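In Python, one way to try a MICE-style approach is scikit-learn's `IterativeImputer`, which models each column with missing values as a function of the other columns and cycles through them. Note that it is inspired by MICE but returns a single imputed dataset by default rather than multiple imputations. The sketch below uses toy data with assumed column names; `max_iter=10` and `random_state=0` are illustrative choices.

```python
import numpy as np
import pandas as pd
# IterativeImputer is still experimental and must be enabled explicitly.
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Toy numerical data; column names and values are assumed for illustration only.
df = pd.DataFrame({
    "bedroom": [2.0, 3.0, np.nan, 5.0, 2.0, 4.0],
    "bathroom": [2.0, 3.0, 3.0, np.nan, 2.0, 4.0],
    "parking_space": [1.0, 2.0, 2.0, 4.0, np.nan, 3.0],
})

# Each column with gaps is regressed on the others, round-robin,
# for up to max_iter rounds.
imputer = IterativeImputer(max_iter=10, random_state=0)
mice_filled = pd.DataFrame(imputer.fit_transform(df), columns=df.columns)
print(mice_filled.round(2))
```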
cephars
Free Lance

Thanks

flibbert_debola

I used the mean for the numerical columns. For the title column, I found an insight that let me replace missing values that satisfy a certain condition, e.g. most houses with a price of 3 million and above are mansions, so I filled the missing values meeting that condition with "mansion". I used that for the title column and dropped the missing values for the loc column.

26 Jul 2023, 12:42
Upvotes 1
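The rule-based fill described above can be written as a boolean mask in pandas. This is a sketch on toy data: the `price`, `title` and `loc` column names come from the post, but the sample values and the exact label spelling ("Mansion") are assumptions.

```python
import pandas as pd

# Toy data; values are assumed for illustration only.
df = pd.DataFrame({
    "price": [3.5e6, 1.2e6, 4.0e6, 0.9e6],
    "title": [None, "Flat", "Mansion", None],
    "loc": ["Lagos", None, "Ogun", "Kano"],
})

# Fill a missing title with "Mansion" only where the price is 3 million or more.
mask = df["title"].isna() & (df["price"] >= 3_000_000)
df.loc[mask, "title"] = "Mansion"

# Drop rows where loc is missing.
rule_filled = df.dropna(subset=["loc"]).reset_index(drop=True)
print(rule_filled)
```

Missing titles that do not meet the condition stay missing here, so they would still need another rule or a fallback fill.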