ICLR Workshop Challenge #2: Radiant Earth Computer Vision for Crop Detection from Satellite Imagery
$7,000 USD
Identify crop type using satellite imagery, and win a trip to present your work at ICLR 2020 in Addis Ababa.
3 February–15 March 2020 23:59
274 data scientists enrolled, 39 on the leaderboard
Field ids
published 8 Feb 2020, 14:08
edited 27 minutes later

Hi @Johnowhitaker

Sorry for a possibly dumb question. The field ID rasters also contain values of 0. Are those to be ignored?

On running this code:

import numpy as np
import rasterio

for i in range(4):
    fp_path1 = f'/content/drive/My Drive/ICLR_SEG/0{i}/{i}_field_id.tif'
    # Read the single band holding the field IDs for this tile
    with rasterio.open(fp_path1) as raster1:
        band1 = raster1.read(1)
    # Count distinct non-zero IDs (assuming 0 means "no field")
    n_fields = np.unique(band1[band1 != 0]).shape[0]
    print(f'Total number of fields in tile {i} is:', n_fields)

The output is:

Total number of fields in tile 0 is: 591

Total number of fields in tile 1 is: 757

Total number of fields in tile 2 is: 256

Total number of fields in tile 3 is: 3084

Adding these up, the total number of fields comes to 4688. That is less than the 4797 that is mentioned. What am I missing?

I would appreciate your reply.

Hmm, I hadn't noticed the discrepancy. To be honest I haven't looked much at this challenge yet. I had a few missing predictions (must be the missing fields) and just filled the 34 missing values with 1/7 ¯\_(ツ)_/¯

I'll see what's up on Monday - for now, I'd say just ignore those few missing ones.

Thanks a lot for your prompt response

Thanks Aninda for raising this. I'm seeing the same issue of getting 4,688 as the total for unique Field IDs (except 0).

In the test set, these 34 unique field IDs are listed in SampleSub.csv but missing from the combined field_id.tif files:

[64, 140, 274, 354, 784, 917, 985, 1401, 1516, 1581, 2092, 2307, 2546, 2555, 2562, 2812, 3084, 3144, 3318, 3458, 3838, 3965, 3966, 3990, 4096, 4098, 4212, 4279, 4280, 4351, 4361, 4366, 4709, 4778]

Assuming the complete set of Field IDs is meant to range from 1 to 4797, the other 75 missing IDs are in the train set:

[ 7, 21, 48, 113, 266, 275, 349, 530, 1062, 1179, 1338, 1445, 1459, 1684, 1766, 1777, 1796, 1903, 1928, 1937, 2042, 2076, 2091, 2120, 2121, 2590, 2599, 2735, 2792, 2937, 2951, 2973, 3043, 3050, 3077, 3157, 3228, 3339, 3346, 3363, 3388, 3390, 3496, 3519, 3534, 3569, 3588, 3591, 3645, 3659, 3685, 3763, 3825, 3831, 3855, 3887, 3892, 3958, 4043, 4063, 4095, 4114, 4199, 4285, 4304, 4317, 4318, 4338, 4375, 4376, 4446, 4528, 4603, 4689, 4704]
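The comparison above boils down to a set difference between the IDs that are expected and the IDs that actually show up in the rasters. A minimal sketch, assuming the expected IDs have already been read from SampleSub.csv and the raster values have been flattened into a plain Python iterable; `missing_field_ids` is a hypothetical helper name and the toy values are invented for illustration, not taken from the challenge data:

```python
def missing_field_ids(expected_ids, raster_values, background=0):
    """Return the sorted IDs that are expected but never appear in the raster values."""
    # Treat the background value (assumed to be 0) as "no field"
    present = set(raster_values) - {background}
    return sorted(set(expected_ids) - present)


# Toy data: IDs 1..6 are expected, but the tile only contains some of them.
expected = range(1, 7)
tile_values = [0, 0, 1, 1, 3, 0, 4, 4, 6]
missing = missing_field_ids(expected, tile_values)
print(missing)  # [2, 5]
```

With a NumPy band, something like `band.ravel().tolist()` could be passed as `raster_values`.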

Hmm, also to add to the above: is it correct that most of the 'fields' are only a few pixels in size? When I looked at the field_id tile, there seemed to be a lot of pixels with ID 0, and each ID > 0 usually covers fewer than 10 pixels...?
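One way to check the field sizes is to count how many pixels carry each non-zero ID. A rough sketch, assuming the tile is available as rows of integer IDs and that 0 is the background value; `pixels_per_field` is a hypothetical helper and the tiny tile is made up for illustration:

```python
from collections import Counter


def pixels_per_field(raster_rows, background=0):
    """Map each non-background field ID to the number of pixels it covers."""
    counts = Counter(v for row in raster_rows for v in row if v != background)
    return dict(counts)


# Toy 3x5 tile: field 5 covers 3 pixels, field 7 covers 2, field 9 covers 1.
tile = [
    [0, 0, 5, 5, 0],
    [0, 7, 5, 0, 0],
    [0, 7, 0, 0, 9],
]
sizes = pixels_per_field(tile)
small = sorted(fid for fid, n in sizes.items() if n < 10)
print(sizes)  # {5: 3, 7: 2, 9: 1}
print(small)  # [5, 7, 9]
```

On a NumPy array, `np.unique(band, return_counts=True)` gives the same counts without the Python loop.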

Thanks Dave for bringing this up. I am a little bit confused with

Thanks wwymak for this. This is another thing in the data that was not very clear to me. Looking forward to someone on the leaderboard kindly explaining.

We are looking into this and will share an update soon.

We have the correct number of fields in the vector layers of the ground reference data, and that is where we generated the list of field IDs from (for both training and test). However, some fields are very long and narrow (even though their area is large), and they receive no pixels during rasterization because their width is less than one Sentinel-2 pixel. This caused those fields to disappear from the data shared with users. We have therefore decided to drop them from the train and test lists, as they cannot be mapped to the Sentinel-2 grid.
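To illustrate how a large but narrow field can end up with no pixels, here is a simplified sketch. It assumes a 10 m pixel grid and a "pixel centre must fall inside the polygon" rule, which is a common rasterization default but not necessarily the exact procedure used for this dataset. The helper `pixels_covered` and the field dimensions are hypothetical, and only axis-aligned rectangles are handled:

```python
import math


def pixels_covered(xmin, ymin, xmax, ymax, pixel_size=10.0):
    """Count grid cells whose centre lies inside an axis-aligned rectangle (metres)."""

    def centres_in(lo, hi):
        # Pixel centres sit at (k + 0.5) * pixel_size for integer k
        first = math.ceil(lo / pixel_size - 0.5)
        last = math.floor(hi / pixel_size - 0.5)
        return max(0, last - first + 1)

    return centres_in(xmin, xmax) * centres_in(ymin, ymax)


# Two hypothetical fields with the same area (1600 m^2):
compact = pixels_covered(0.0, 0.0, 40.0, 40.0)   # 40 m x 40 m
narrow = pixels_covered(0.5, 0.0, 4.5, 400.0)    # 4 m x 400 m, between pixel centres
print(compact)  # 16
print(narrow)   # 0
```

Both rectangles cover the same area, yet the narrow one never contains a pixel centre and so would vanish from the rasterized field ID layer under this rule.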

There are now 3,286 fields in the train set and 1,402 fields in the test set.

Thank you for bringing this to our attention and good luck with the challenge!