Can someone help me with the missing values? There are a lot of them. Do I have to drop them, or fill them with the mean or median, or is there another way? Also, why are there so many missing values in the dataset?
You can experiment with replacing them with the mean, median, or mode. You can also try linear interpolation. Compare your model's performance under each approach, and then settle on the best way to fill in the missing values.
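To make the comparison concrete, here is a rough sketch of how the fill strategies could be scored against each other. It assumes pandas and scikit-learn are installed. The data is synthetic, and the column names (`f1`, `f2`, `f3`), the `Ridge` model, and the R^2 metric are placeholders I picked for illustration, not anything from the competition.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the real data (hypothetical columns and target).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 3)), columns=["f1", "f2", "f3"])
y = 2.0 * X["f1"] - X["f2"] + rng.normal(scale=0.1, size=200)

# Knock out roughly 20% of the feature values at random.
X_missing = X.mask(rng.random(X.shape) < 0.2)

scores = {}

# Mean, median, and mode ("most_frequent") imputation, fitted inside each CV fold.
for strategy in ["mean", "median", "most_frequent"]:
    model = make_pipeline(SimpleImputer(strategy=strategy), Ridge())
    scores[strategy] = cross_val_score(
        model, X_missing, y, cv=5, scoring="r2"
    ).mean()

# Linear interpolation fills each gap from neighbouring rows, so it only makes
# sense when row order is meaningful (for example, a time series).
X_interp = X_missing.interpolate(method="linear", limit_direction="both")
scores["interpolate"] = cross_val_score(
    Ridge(), X_interp, y, cv=5, scoring="r2"
).mean()

for name, score in scores.items():
    print(f"{name}: mean R^2 = {score:.3f}")
```

On your own data, swap in your real features, target, model, and the competition metric. Whichever strategy scores best in cross-validation is the one to keep.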
thanks kDegrandlac
As a follow-up to the answer provided by @kharerim: you can also try boosting algorithms such as XGBoost and similar libraries, which can handle missing values on their own.
You can also check out past Zindi competitions and see how participants handled data with missing values.
Again, it all boils down to experimenting to find what works best for you.
Hope this helps.
thank you DanielTobi0