There are a few basic approaches if you just want to fit a model:
1. Fill in with a central value (mean, median, or mode).
Nobody here would recommend that as a default; it's not always a good idea.
2. Fill in with somewhat more advanced methods.
You can use KNN to find the data points most similar to the incomplete row in feature space and fill the missing values from those neighbours, or use an iterative imputer. Both exist in the sklearn library (KNNImputer and IterativeImputer) and are easy to use.
3. Leave it to the model.
Sometimes the fact that a value is missing is itself informative, or the model can handle the gap automatically. Some models work with missing values directly, e.g. XGBoost, CatBoost, LightGBM.
Good luck.
Thanks for the info.
I agree, but as far as the XGBoost model is concerned, it doesn't learn much if the data is handed to the model as it is.