GeoAI Challenge for Air Pollution Susceptibility Mapping by ITU

Start

Jul 21, 23

Oct 14, 23

Reveal

Oct 14, 23

Junnh

Jomo Kenyatta University of Agriculture and Technology

Irregular distribution of location-data

Data · 24 Jul 2023, 12:10 · 3

Looking through the train dataset, as well as the air-pollution and seasonal meteorological datasets, I noticed an irregularity: the train dataset was recorded primarily in the metropolitan area rather than the city itself, while the air-pollution and seasonal meteorological datasets are concentrated in the city rather than the metropolitan area. None of the locations in the air-pollution dataset or in any of the 4 seasonal datasets match a location in the train dataset.

Is this correct? Is this expected? If so, won't it be difficult to combine any of the external datasets with the train dataset?

Discussion 3 answers

Muliasi

I noticed the same thing. I am not sure, but I think we have to apply an interpolation technique to create new training data. I feel Inverse Distance Weighting (IDW) would be effective.
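
For anyone unfamiliar with IDW, here is a minimal sketch of the idea, not a tested pipeline for this challenge. The function name `idw_interpolate`, the `power` and `eps` parameters, and the sample coordinates and values are all made up for illustration. It also assumes plain Euclidean distance on the coordinates, which is only an approximation if the inputs are latitude/longitude in degrees.

```python
import numpy as np

def idw_interpolate(known_xy, known_values, query_xy, power=2.0, eps=1e-12):
    """Estimate values at query points as a distance-weighted mean of known points.

    Hypothetical helper: weights are 1 / distance**power, so nearer
    points contribute more. `eps` avoids division by zero when a query
    point coincides with a known point.
    """
    known_xy = np.asarray(known_xy, dtype=float)
    known_values = np.asarray(known_values, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)

    # Pairwise distances, shape (n_query, n_known).
    diff = query_xy[:, None, :] - known_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    weights = 1.0 / np.maximum(dist, eps) ** power
    return (weights * known_values).sum(axis=1) / weights.sum(axis=1)

# Made-up example: two known points, one query point halfway between them.
known_xy = [[0.0, 0.0], [2.0, 0.0]]
known_values = [10.0, 20.0]
estimate = idw_interpolate(known_xy, known_values, [[1.0, 0.0]])
```

Since the query point is equally far from both known points, the estimate is simply their average. A larger `power` makes the result follow the nearest point more closely.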

24 Jul 2023, 19:40

Upvotes 0

apugliese

The training dataset has been recorded primarily from the metropolitan area and not the city itself because there are more training points in the metropolitan area and the model can be later applied to the city itself. That is the reason why the seasonal datasets are provided only for the city.

The locations in the seasonal datasets do not match those of the training dataset because the training data is provided only at in-situ station points, while the seasonal datasets are a regular grid of interpolated data.
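
If someone does want to attach a gridded value to a station point, one simple option is a nearest-grid-cell lookup. The sketch below is only an illustration under assumptions: `nearest_grid_cell`, the toy grid, and the `grid_no2` values are invented, and distances are plain Euclidean on the coordinates. Stations lying outside the grid extent would still be matched to some far-away cell, so the returned distance should be checked against a threshold of your choosing.

```python
import numpy as np

def nearest_grid_cell(grid_xy, station_xy):
    """Return, for each station, the index of and distance to the closest grid cell.

    Hypothetical helper using brute-force pairwise distances, which is
    fine for small grids; a spatial index would scale better.
    """
    grid_xy = np.asarray(grid_xy, dtype=float)
    station_xy = np.asarray(station_xy, dtype=float)

    # Pairwise squared distances, shape (n_stations, n_grid_cells).
    diff = station_xy[:, None, :] - grid_xy[None, :, :]
    sq_dist = (diff ** 2).sum(axis=2)

    idx = sq_dist.argmin(axis=1)
    dist = np.sqrt(sq_dist[np.arange(len(station_xy)), idx])
    return idx, dist

# Made-up 2 x 2 grid with one value per cell, and two station points.
grid_xy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
grid_no2 = np.array([12.0, 15.0, 18.0, 21.0])
station_xy = np.array([[0.1, 0.9], [0.8, 0.2]])

idx, dist = nearest_grid_cell(grid_xy, station_xy)
station_no2 = grid_no2[idx]  # grid value sampled at each station
```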

26 Jul 2023, 09:12

Upvotes 1

xiaoironman

The train dataset seems to have many identical entries, for example the 1st and 2nd rows and the 3rd and 4th rows. Is there a special reason for that, or should we remove the repeated data?

replied to apugliese · 29 Jul 2023, 17:33
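
A quick way to check how many rows are repeated, and what dropping them would leave, is sketched below with pandas. The column names and values are placeholders, not the real train schema, and whether the duplicates should actually be removed is still an open question for the organisers.

```python
import pandas as pd

# Placeholder frame standing in for the train dataset (made-up columns and values).
train = pd.DataFrame({
    "lat": [45.1, 45.1, 45.3, 45.3, 45.5],
    "lon": [9.1, 9.1, 9.2, 9.2, 9.4],
    "value": [30.0, 30.0, 22.0, 22.0, 18.0],
})

# keep=False flags every member of a duplicated group, not just the later copies.
dup_mask = train.duplicated(keep=False)
n_dup_rows = int(dup_mask.sum())

# drop_duplicates keeps the first occurrence of each fully identical row.
deduped = train.drop_duplicates().reset_index(drop=True)
```

Passing `subset=[...]` to `duplicated` or `drop_duplicates` would restrict the comparison to chosen columns, for example only the coordinates.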

Upvotes 0
