AirQo Ugandan Air Quality Forecast Challenge
$5,000 USD
Predict future air quality levels and empower communities to plan and protect their health
740 data scientists enrolled, 318 on the leaderboard
14 March—31 May
Handling missing values
published 21 May 2020, 13:37

Any suggestions on handling null values? There seem to be too many to replace them with the mean or median, or to drop the affected rows (samples).

If you are using LightGBM or a similar package, I would suggest leaving them as they are; these libraries handle NaN values by default, and quite well. Another tip is to use a very large negative number, say -99999, so that it is handled completely differently from the other values. You can also try to interpolate the values. I would not suggest mean, median, or mode imputation in this case.
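To make the sentinel and interpolation options concrete, here is a minimal sketch in plain Python. The helper names `fill_sentinel` and `interpolate_linear` and the toy `readings` list are made up for illustration and are not part of the competition data or of any library. With pandas you would more likely reach for `Series.fillna` and `Series.interpolate`, and if you go the LightGBM route you would skip this step and pass the NaNs through unchanged.

```python
import math


def fill_sentinel(values, sentinel=-99999.0):
    """Replace NaN entries with a sentinel far outside the real data range."""
    return [sentinel if math.isnan(v) else v for v in values]


def interpolate_linear(values):
    """Linearly interpolate NaN gaps that lie between two known readings.

    Leading and trailing NaNs have no neighbour on one side, so they stay NaN.
    """
    out = list(values)
    known = [i for i, v in enumerate(out) if not math.isnan(v)]
    for left, right in zip(known, known[1:]):
        gap = right - left
        for i in range(left + 1, right):
            # Walk along the straight line joining the two known readings.
            out[i] = out[left] + (out[right] - out[left]) * (i - left) / gap
    return out


# Hypothetical sensor readings with missing values (not real competition data).
readings = [math.nan, 10.0, math.nan, math.nan, 16.0, math.nan]

sentinel_filled = fill_sentinel(readings)
interpolated = interpolate_linear(readings)

print(sentinel_filled)
print(interpolated)
```

The sentinel version mostly makes sense for tree-based models, which can split the placeholder off into its own branch. Interpolation only makes sense if the readings are ordered in time, so check that before relying on it.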