
GeoAI Ground-level NO2 Estimation Challenge by ITU

Helping Italy
1 000 CHF
Challenge completed 12 months ago
Prediction
804 joined
372 active
Start: May 22, 24
Close: Nov 15, 24
Reveal: Nov 15, 24
Missing values in data
Help · 29 Jul 2024, 09:05 · 3

I need some guidance on how to handle missing values in this dataset. What are the best practices for filling in these missing values?

Discussion 3 answers
A7med7
University of Khartoum

There are a few basic methods if you just want to fit a model:

1. Fill in with central values (mean, median, or mode).

Hardly anyone here would recommend that. It's not always a good idea, since it ignores how the feature relates to the others.

2. Fill in with somewhat more advanced methods.

You can use KNN to find the data points that are most similar in feature space and use their values to fill in the missing ones, or you can use an iterative imputer. Both exist in the sklearn library (KNNImputer and IterativeImputer) and are easy to implement, so check them out.

3. Leave it to the model.

Sometimes the fact that a value is missing is itself information, or the missing value can be handled automatically by the model. Some models can work with missing values directly, e.g. xgboost, catboost, lightgbm.

Good luck.

29 Jul 2024, 10:24
Upvotes 4

Thanks for the info.

I agree, but as far as the XGBoost model is concerned, it doesn't learn much if the data are handed to the model as they are.