Urban Air Pollution Challenge by #ZindiWeekendz

Helping Africa
$300 USD
Challenge completed over 5 years ago
Prediction
236 joined
134 active
Start: Apr 10, 20
Close: Apr 12, 20
Reveal: Apr 12, 20
SPLIT OF DATA
Notebooks · 11 Apr 2020, 09:28 · 1

I'm trying to understand how the train and test datasets have been split. I think place_id indicates the different regions within which the data was collected. Train has 349 different regions and test has 179 different regions. Regions in train are not included in test. So it means we are building a model using data from some regions and applying that model to predict pollution in other regions. Assuming these regions indicate different cities in Africa, what is the probability that a model trained using Tunis data will accurately forecast air pollution in Mogadishu? Or what is the relationship between the regions, since the data doesn't include geolocation data? I think I must have misunderstood something...
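
If the split really is by place_id, local validation probably should hold out whole regions as well, so that the validation score reflects prediction on unseen places. Below is a minimal sketch of that idea using only the standard library. The helper names `shared_groups` and `group_k_fold` and the toy IDs are hypothetical and not part of the competition data; scikit-learn's `GroupKFold` offers a similar group-wise split.

```python
from collections import defaultdict


def shared_groups(train_ids, test_ids):
    """Return the place IDs that occur in both splits (empty set if disjoint)."""
    return set(train_ids) & set(test_ids)


def group_k_fold(group_ids, n_splits=5):
    """Yield (train_idx, valid_idx) pairs where no group spans both sides."""
    rows_by_group = defaultdict(list)
    for row, gid in enumerate(group_ids):
        rows_by_group[gid].append(row)

    # Assign whole groups to folds, largest group first,
    # always into the fold that currently holds the fewest rows.
    folds = [[] for _ in range(n_splits)]
    ordered = sorted(rows_by_group, key=lambda g: (-len(rows_by_group[g]), str(g)))
    for gid in ordered:
        smallest = min(range(n_splits), key=lambda i: len(folds[i]))
        folds[smallest].extend(rows_by_group[gid])

    for i in range(n_splits):
        valid_idx = sorted(folds[i])
        train_idx = sorted(r for j, f in enumerate(folds) if j != i for r in f)
        yield train_idx, valid_idx


# Toy place_id column (made-up values, not taken from the competition files).
place_ids = ["A", "A", "B", "B", "B", "C", "D", "D"]
splits = list(group_k_fold(place_ids, n_splits=2))
```

Calling `shared_groups` on the place_id columns of train and test would confirm whether the two files are in fact disjoint, and each pair in `splits` keeps every place entirely on one side, mirroring that setup.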

Discussion 1 answer

I think the focus should be on the readings. This is why the model performs worse when fed location data.

11 Apr 2020, 13:19
Upvotes 0