AirQo African Air Quality Prediction Challenge

$3 000 USD

Completed (~2 years ago)

Start

Mar 15, 24

Jun 16, 24

Reveal

Jun 16, 24

yanteixeira

Latitude and Longitude

Help · 23 May 2024, 19:59 · 5

Hello fellow Zindians,

I would like to know how you all are dealing with these two features. I have a feeling that it is not correct to treat them as numerical features because GBDTs split one feature at a time. This univariate splitting can miss the complex interaction between latitude and longitude that represents true geographic proximity. However, at the same time, I have not yet found a strong reason not to use them. So far, I have combined the two into one categorical feature.

I have tried countless transformations and new features, but none have convinced me.
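To make the concern above concrete, here is a minimal sketch of three common ways to expose joint latitude/longitude structure to a tree model: rotated coordinates (so a single-feature split corresponds to an oblique boundary), a grid-cell label that combines both coordinates into one categorical feature, and the haversine distance to a reference point. The function names, the rotation angles, the 0.5 degree cell size, and the idea of a reference point are all assumptions made for illustration; none of them come from the competition data or from any participant's solution.

```python
import math


def rotated_coords(lat, lon, angles_deg=(15, 30, 45, 60, 75)):
    """Project (lat, lon) onto rotated axes.

    A threshold on one rotated feature is an oblique line in the original
    coordinates, which an axis-aligned split cannot express in one step.
    The angle set is an arbitrary illustrative choice.
    """
    feats = {}
    for a in angles_deg:
        t = math.radians(a)
        feats[f"rot{a}_x"] = lon * math.cos(t) + lat * math.sin(t)
        feats[f"rot{a}_y"] = -lon * math.sin(t) + lat * math.cos(t)
    return feats


def grid_cell(lat, lon, cell_deg=0.5):
    """Combine both coordinates into one categorical label by grid binning.

    cell_deg is a hypothetical resolution, not a tuned value.
    """
    return f"{math.floor(lat / cell_deg)}_{math.floor(lon / cell_deg)}"


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


# Usage with made-up coordinates (not taken from the dataset).
row = {"lat": -1.2, "lon": 36.8}
features = rotated_coords(row["lat"], row["lon"])
features["cell"] = grid_cell(row["lat"], row["lon"])
features["dist_to_ref_km"] = haversine_km(row["lat"], row["lon"], 0.0, 32.0)
```

Whether any of these helps depends on the validation scheme: a grid-cell label in particular identifies a location outright, so it is only useful if the same cells also appear at prediction time.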

Discussion 5 answers

Gabriel_Figueiro

I think that the coordinates caused the model to memorize the patterns of the cities, but when we try to predict on the testing set, it doesn't work because there are different cities.
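One way to check for this kind of memorization is to validate with folds split by city, so that every validation fold contains only cities the model has not seen in training. Below is a minimal pure-Python sketch of such a grouped split. The helper name `group_kfold` and the placeholder city labels are assumptions for illustration; scikit-learn's `GroupKFold` offers the same idea if that library is available.

```python
from collections import defaultdict


def group_kfold(groups, n_splits=5):
    """Yield (train_idx, val_idx) pairs where no group is on both sides."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    if n_splits > len(by_group):
        raise ValueError("n_splits cannot exceed the number of distinct groups")

    # Greedy balancing: place the largest groups first, each into the
    # fold that currently holds the fewest rows.
    folds = [[] for _ in range(n_splits)]
    for g in sorted(by_group, key=lambda g: (-len(by_group[g]), str(g))):
        smallest = min(range(n_splits), key=lambda k: len(folds[k]))
        folds[smallest].extend(by_group[g])

    for k in range(n_splits):
        val_idx = sorted(folds[k])
        train_idx = sorted(i for j in range(n_splits) if j != k for i in folds[j])
        yield train_idx, val_idx


# Placeholder labels; real rows would carry the city of each measurement.
cities = ["city_a", "city_a", "city_b", "city_c", "city_c", "city_c", "city_d"]
for train_idx, val_idx in group_kfold(cities, n_splits=3):
    train_cities = {cities[i] for i in train_idx}
    val_cities = {cities[i] for i in val_idx}
    # No city may appear in both the training and validation parts.
    assert not (train_cities & val_cities)
```

If the score with coordinate features drops sharply under this split compared with a random split, that suggests the model is memorizing locations rather than learning something that transfers to new cities.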

24 May 2024, 00:51

Upvotes 1

yanteixeira

Good answer. I think the same applies to other features as well.

replied to Gabriel_Figueiro · 24 May 2024, 00:55

Upvotes 0

Mugisha_

Given that the model will be applied to other locations at inference time, it generally doesn't make sense to train with any location-based features, even though the curated data seems to encourage it.

On the other hand, there isn't enough pollutant data to train a useful model that predicts pm2_5 concentrations from pollutant features alone, so training with latitude- and longitude-based features is what yields better scores for me.

24 May 2024, 23:01

Upvotes 1

yanteixeira

Funny to see that other participants are also experiencing this dilemma. Super interesting competition so far!

replied to Mugisha_ · 24 May 2024, 23:08

Upvotes 1

Nembot_Jules

Yes, I agree with your point. I also think we need other countries in the train set.

replied to Mugisha_ · 25 May 2024, 07:34

Upvotes 0
