Amini Soil Prediction Challenge

Helping Africa
$7 000 USD
Completed (9 months ago)
Prediction
Earth Observation
1061 joined
339 active
Start: Apr 02, 25
Close: Jun 22, 25
Reveal: Jun 23, 25
A Doubt
Data · 23 May 2025, 12:16 · 3

According to Variabledefenitions.csv, the PID is the unique identifier of the soil sample site, but in the satellite data files there are multiple longitude/latitude values for a single PID. What does that mean? Did they take the same sample from different places? If I am the one who is wrong, please correct me, Zindians. And if you are willing, please share the method you used to merge the satellite data files with the train and test files. Thank you.

Discussion 3 answers
rapsoj
University of Oxford

What's the largest difference in lat/lon for a single PID you have observed?

The largest one I could find in the Landsat data (quick check) is about 17m, so I think for practical purposes it corresponds to the same area. I wouldn't worry about this. Join on PID if that is easiest for you.

-----

Calculation:

We know Earth's circumference is 40,075,000m and there are 360 degrees in a circle. Thus each degree of latitude is 40,075,000m / 360 degrees ≈ 111,320 m/degree. The largest change in latitude I observed for the same PID was 0.000152 degrees, which converted to metres would be 0.000152 degrees * 111,320 m/degree ≈ 16.9m.
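
The same arithmetic as a small Python sketch, in case anyone wants to plug in their own numbers. The function name is made up for illustration, and it only handles a north-south (latitude) difference; it says nothing about longitude.

```python
# Circumference figure quoted in the calculation above.
EARTH_CIRCUMFERENCE_M = 40_075_000


def lat_degrees_to_metres(delta_deg: float) -> float:
    """Convert a latitude difference in degrees to a north-south distance in metres."""
    metres_per_degree = EARTH_CIRCUMFERENCE_M / 360  # roughly 111,320 m per degree
    return abs(delta_deg) * metres_per_degree


# Largest latitude difference reported for a single PID in this thread.
print(round(lat_degrees_to_metres(0.000152), 1))  # prints 16.9
```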

23 May 2025, 15:01
Upvotes 1

You should consider both longitude and latitude. The distance doesn't seem to be as small as 17 meters. Try it, but if I'm wrong please let me know.
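
One way to check this with both coordinates is the haversine great-circle distance, taking the largest pairwise distance among the points that share a PID. This is only a sketch: the `(pid, lat, lon)` tuple layout, the function names, and the sample coordinates are assumptions for illustration, not the actual column names or values in the competition files.

```python
import math
from collections import defaultdict
from itertools import combinations

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, a common approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def max_spread_per_pid(rows):
    """rows: iterable of (pid, lat, lon). Returns {pid: largest pairwise distance in metres}."""
    points = defaultdict(set)
    for pid, lat, lon in rows:
        points[pid].add((lat, lon))  # a set drops exact duplicate coordinates
    return {
        pid: max((haversine_m(*a, *b) for a, b in combinations(pts, 2)), default=0.0)
        for pid, pts in points.items()
    }


# Hypothetical rows, only to show the call shape.
example_rows = [
    ("site_A", -1.000000, 36.000000),
    ("site_A", -1.000152, 36.000300),
    ("site_B", 5.500000, -0.200000),
]
spread = max_spread_per_pid(example_rows)
```

The pairwise loop is quadratic in the number of distinct points per PID, which should be fine for a quick check but may be slow if a PID has thousands of rows.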

rapsoj
University of Oxford

I think the reason you are seeing these discrepancies is that the satellite data comes from pixels that normally have a resolution between 30m and 1km. Likely what has happened is that Zindi has taken the centroid of each pixel from the remote sensing data and used that for the table. The PID is probably just the closest target ID to that centroid. Thus the linked data is the best available satellite data for the target.
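
If each PID maps to several satellite rows, one possible way to join on PID is to collapse those rows to a single row per PID first and then attach the result to the train/test rows. The sketch below averages each feature, which is my assumption and not something Zindi prescribes; the `"PID"` key matches the thread, but the feature name `"band_1"`, the function names, and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_pid(satellite_rows, feature_names):
    """Average each listed feature over all satellite rows that share a PID."""
    grouped = defaultdict(list)
    for row in satellite_rows:
        grouped[row["PID"]].append(row)
    return {
        pid: {f"{name}_mean": mean(r[name] for r in rows) for name in feature_names}
        for pid, rows in grouped.items()
    }


def join_on_pid(sample_rows, features_by_pid):
    """Left join: keep every sample row, adding aggregated features when the PID has any."""
    return [{**row, **features_by_pid.get(row["PID"], {})} for row in sample_rows]


# Hypothetical data, only to show the call shape.
satellite = [
    {"PID": "site_A", "band_1": 1.0},
    {"PID": "site_A", "band_1": 3.0},
]
train = [{"PID": "site_A"}, {"PID": "site_B"}]
merged = join_on_pid(train, aggregate_by_pid(satellite, ["band_1"]))
```

Rows whose PID has no satellite data pass through unchanged, so missing features have to be handled downstream. Other aggregates (median, min/max, per-date values) may work better than the mean depending on the model.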