CGIAR Wheat Growth Stage Challenge by CGIAR Platform for Big Data in Agriculture
$3,000 USD
Predict which phase of growth wheat crops are in using photos taken by farmers in India
459 data scientists enrolled, 204 on the leaderboard
Agriculture, Computer Vision, Prediction, Image, SDG2
28 August—4 October
38 days
1st Place Solution
published 5 Oct 2020, 02:33
edited ~6 hours later

Thanks Zindi, organizers and all the participants for this nice competition!

I was probably a bit lucky to take first place on the Private LB. You can find my solution description and source code here:


Congratulations Yauhen, thank you for sharing your approach.

Congratulations and thank you for sharing. :)

Congratulations! What was the best single model score you got?

The best single model (average of 5 folds on the test set) achieved 0.40327 RMSE on the Private LB. It was a ResNet101.
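
For readers unfamiliar with fold averaging, here is a minimal sketch of the idea: each of the 5 fold models predicts on the same test set, the predictions are averaged per sample, and RMSE is the competition metric. The function names `average_folds` and `rmse` and the toy numbers are illustrative assumptions, not taken from the winning code.

```python
import math

def average_folds(fold_preds):
    """Average per-sample predictions across folds.

    fold_preds: list of lists, one inner list of test predictions per fold.
    """
    n_folds = len(fold_preds)
    return [sum(col) / n_folds for col in zip(*fold_preds)]

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

# Toy example (made-up numbers): 2 folds, 3 test images.
fold_preds = [
    [2.0, 4.0, 6.0],
    [3.0, 4.0, 7.0],
]
averaged = average_folds(fold_preds)
```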

Congratulations ... thank you for sharing your approach.

Congrats! I have one question: it is fine to use an ensemble when we want to increase accuracy, but at deployment time three models will take a lot of memory and increase overall latency. So is it feasible or not?

Thank you

It depends on your use case. If you need fast real-time predictions, then, of course, large ensembles are not the way to go. However, if you don't have strict time limits and are interested in the highest quality possible, then ensembles could help.

Yeah, you're right. In deployment I don't think we'd want 3 models each having to make predictions before a final inference can be made, as that will increase latency.

Congratulations and thank you for sharing!

thanks a lot for sharing your solution so quickly! Learning a ton!!

Congrats and good job!

Thanks for this. It's highly appreciated.

Congrats! Is it OK to use only the good-quality images for classification?

Because our problem statement is to predict the 7 growth stages of the wheat, right?

Thank you

Yes, that's right, but the Data tab says: `All the test images have reliable labels`. It means that it's better to train on good-quality labels to get better performance on the test dataset.
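
A minimal sketch of that filtering step, under loudly stated assumptions: the field name `label_quality` and the value `2` for "good" are hypothetical placeholders here, so check the actual training file before relying on them. The rows are made-up toy data.

```python
def keep_reliable(rows, quality_key="label_quality", good_value=2):
    """Keep only the rows whose label is marked as reliable.

    quality_key and good_value are assumptions for illustration; adjust
    them to whatever the real training metadata uses.
    """
    return [row for row in rows if row[quality_key] == good_value]

# Toy metadata (made up): two reliable rows and two unreliable ones.
train_rows = [
    {"image_id": "a", "growth_stage": 2, "label_quality": 2},
    {"image_id": "b", "growth_stage": 1, "label_quality": 1},
    {"image_id": "c", "growth_stage": 7, "label_quality": 2},
    {"image_id": "d", "growth_stage": 6, "label_quality": 1},
]
reliable_rows = keep_reliable(train_rows)
```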

Hi @bes, thank you very much for sharing. I don't know if you have one, but could you also share or explain your methodology for approaching challenges? How do you select models? How do you evaluate them? How do you form hypotheses to improve models? Do you keep a simple sheet to enumerate what is going well and what is not? ...

Hi @eaedk! It really depends on the problem type.

But usually I begin with exploratory data analysis, followed by hypothesis generation, establishing local validation, and trying different ideas and approaches. I create a document where I store all the thoughts, assumptions, papers, and resources that I think could work for this particular problem. Sometimes such a document runs to as many as 20 pages of text. Then I make a prioritized list and try one idea after another, starting with the highest priority (and I save all the results).

Hi @bes, I had another question. How did you decide to multiply the predicted probabilities by the class labels and sum them up?

I guess it's just a natural idea. We have 5 classes in the good labels: 2, 3, 4, 5, 7. Classification gives us softmax probabilities for each class, so we can take a probability-weighted average of the labels to obtain the final prediction.
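
In other words, the prediction is the expected value of the label under the predicted distribution. A minimal sketch follows; the label list comes from the reply above, while the function names and the logits are made up for illustration and are not taken from the actual solution code.

```python
import math

# Class labels present in the good-quality subset, as listed in the thread.
LABELS = [2, 3, 4, 5, 7]

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_stage(probs, labels=LABELS):
    """Probability-weighted average of the class labels."""
    return sum(p * y for p, y in zip(probs, labels))

# Made-up logits for one image, one score per class in LABELS.
probs = softmax([0.1, 2.0, 1.5, -1.0, -2.0])
prediction = expected_stage(probs)
```

Because the result is a continuous value between 2 and 7, it suits an RMSE metric better than a hard argmax: a model torn between stages 3 and 4 predicts something near 3.5 rather than committing to one of them.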