CGIAR Crop Yield Prediction Challenge
Can you predict maize yields on East African farms using satellite data?
21 October 2020—7 February 2021
Climate data broken?
published 19 Nov 2020, 13:23

Dear Zindi Team,

It seems to me that the climate data you provided is broken. Within each Field_ID, all pixel values are the same in all months (only for the *_CLIM_* data, not the *_S2_* data). The individual images for each field have different mean values, but there is no spatial or temporal variability within each image. Can you please confirm whether this is correct? Looking at the data source given in ImageBands.docx (TerraClimate), it seems to me that there should be different values for each month.

Thanks for clarifying,


Hi, the reason you may see the same value for all pixels of a field in a particular month is that TerraClimate data has a much coarser resolution (about 5 km, if I am not wrong) than the Sentinel-2 bands (10 m). For this competition, the bioclimate data were resampled to the same resolution as the Sentinel-2 bands.
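To illustrate the point about resampling, here is a minimal sketch of nearest-neighbour upsampling with NumPy. It is not the preprocessing Zindi actually used; the grid values, the upsampling factor of 500 (roughly 5 km over 10 m) and the chip position are all assumptions chosen for illustration.

```python
import numpy as np

def upsample_nearest(coarse, factor):
    """Repeat each coarse cell `factor` times along both axes (nearest neighbour)."""
    return np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)

# Hypothetical 2x2 coarse climate grid (values are made up).
coarse = np.array([[20.0, 21.0],
                   [22.0, 23.0]])

# Assumption: each coarse cell covers 500x500 fine pixels.
fine = upsample_nearest(coarse, factor=500)

# Hypothetical 40x40-pixel field chip lying entirely inside the first coarse cell.
chip = fine[100:140, 200:240]
print(chip.shape, np.unique(chip))
```

A field only a few hundred metres across usually sits inside a single coarse cell, so every pixel in its chip carries the same value.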

Yes, that might explain the spatial part (though it is not clear to me why Zindi provides it at the same resolution as Sentinel-2 if it is all the same value anyway?!). However, I also do not see any variation in time (across the different months), and to my understanding of the data source that should not happen. Can you check with your data? I might have a bug in my code, but I checked it twice, so I wonder whether I am alone with this observation or whether there is an issue with the data preprocessing on Zindi's side.

I'll look into this. We'd expect the same value across all pixels in a field for a given date (because of the low-resolution climate data), but it should change over time.
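For anyone who wants to run the same check on their own copy, a rough sketch is below. It assumes the climate band for one field has already been loaded into an array of shape (n_months, height, width); the function name `variability_report` and the synthetic arrays are hypothetical, and NaN handling is left out.

```python
import numpy as np

def variability_report(stack, atol=0.0):
    """Check one field's climate band, shape (n_months, height, width)."""
    stack = np.asarray(stack, dtype=float)
    # Spatial spread within each month: max minus min over the pixel axes.
    spatial_range = stack.max(axis=(1, 2)) - stack.min(axis=(1, 2))
    # Temporal spread: range of the per-month mean values.
    monthly_mean = stack.mean(axis=(1, 2))
    temporal_range = monthly_mean.max() - monthly_mean.min()
    return {
        "spatially_constant": bool(np.all(spatial_range <= atol)),
        "temporally_constant": bool(temporal_range <= atol),
    }

# Synthetic data: constant within each month, but changing from month to month.
expected = np.arange(12, dtype=float)[:, None, None] * np.ones((12, 4, 4))
# Synthetic data: one value everywhere, in every month.
suspicious = np.full((12, 4, 4), 7.0)

print(variability_report(expected))
print(variability_report(suspicious))
```

The first case matches what is described above as expected (flat in space, varying in time); the second matches the problem reported in this thread.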

Hi John, any news on that issue?

Sorry about the delay. I'm hoping that by the end of the week we'll have the weather data and possibly some extra data shared. I'll post here and we'll also likely do an announcement when the changes are made.

Cool, thanks! Looking forward to it.

New data is now up in the Data section. There is now climate data from all years for all fields, plus some extra soil data as a bonus. Thanks again for noticing this.

Thanks, I had a look at the data. Great to get some more information on soil! Concerning the climate data: it is good to have all of it now. Unfortunately, we cannot tell (at least I haven't seen how) when the harvest on each field took place. To process the data in a meaningful way, we would either need to know the harvest year for each Field_ID, or you could crop the climate data so that we only get the weather for the harvest year (we might not need to know which year it is then, the same way it was provided previously in the zip files). Thanks again, Zindi Team!

@Johnowithaker: Any updates on the date issue? Would be greatly appreciated!

Sorry, I just figured out that the year is given for the train set. However, we have no access to the year in the test set. Without this information for the test set, we cannot process the climate data in a meaningful way. Or am I missing something?! If you added the year to SampleSubmission.csv, I think that would fix the problem.
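To make the request concrete, here is a small sketch of how a year column on the test IDs could be used to pick out the matching climate rows with pandas. The `Year`, `Month` and `precip` columns, the long table layout and all values are assumptions for illustration; only `Field_ID` comes from the competition files.

```python
import pandas as pd

# Hypothetical test IDs with a harvest year (the "Year" column is assumed).
test_years = pd.DataFrame({"Field_ID": ["A", "B"], "Year": [2017, 2019]})

# Hypothetical long-format climate table: one row per field, year and month.
climate = pd.DataFrame({
    "Field_ID": ["A", "A", "B", "B"],
    "Year": [2016, 2017, 2018, 2019],
    "Month": [5, 5, 5, 5],
    "precip": [80.0, 95.0, 60.0, 72.0],
})

# An inner merge on both keys keeps only the rows for each field's harvest year.
harvest_climate = climate.merge(test_years, on=["Field_ID", "Year"], how="inner")
print(harvest_climate)
```

Without the year for the test fields there is nothing to join on, which is why the all-years climate data cannot be narrowed down for the test set.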

Apologies. You are correct - I didn't realize the test years weren't available. I've been on leave, as has most of the Zindi team, so it might be a few more days before the files are available in the Data section, but here are the test field IDs with the year as a Google Drive link:

I'll update here when the files have been uploaded. Thanks again!