ICLR Workshop Challenge #1: CGIAR Computer Vision for Crop Disease
$5,000 USD
Identify wheat rust in images from Ethiopia and Tanzania, and win a trip to present your work at ICLR 2020 in Addis Ababa.
820 data scientists enrolled, 306 on the leaderboard
29 January—29 March
Exploring metric and public test dataset properties
published 6 Mar 2020, 07:48
edited 2 minutes later

The metric is nice and simple (it's log loss), so we can learn something about the public test dataset with just a few submissions.

Suppose every row of a submission is the same vector [a1, a2, a3] (summing to 1), and let [r1, r2, r3] be the ratios of the 'leaf_rust', 'stem_rust', 'healthy_wheat' classes in the public test dataset. Then the score is -(r1*log(a1) + r2*log(a2) + r3*log(a3)).

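Each such score is linear in r1, r2, r3. So with three different submissions (three different choices of [a1, a2, a3]) we get three equations and can recover r1, r2, r3 by solving a 3x3 linear system, as long as the three vectors [log(a1), log(a2), log(a3)] are linearly independent.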
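Here is a minimal sketch of that linear system in Python with NumPy. I don't reproduce real leaderboard scores here: the probe vectors are hypothetical, and the "scores" are simulated from stand-in ratios (the rounded values reported in this post) just to show that the solve recovers them. The function names are mine, not anything from the competition.

```python
import numpy as np

def constant_submission_score(a, r):
    """Log loss when every row predicts `a` and the true class ratios are `r`."""
    a = np.asarray(a, dtype=float)
    r = np.asarray(r, dtype=float)
    return -float(np.dot(r, np.log(a)))

def recover_ratios(probes, scores):
    """Solve (-log(probes)) @ r = scores for r; needs 3 independent probes."""
    coeffs = -np.log(np.asarray(probes, dtype=float))
    return np.linalg.solve(coeffs, np.asarray(scores, dtype=float))

# Hypothetical constant submissions, each row summing to 1.
probes = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]

# Stand-in for the unknown public ratios (assumption: used only to
# simulate what the leaderboard would return for each probe).
stand_in_r = np.array([0.535714, 0.303571, 0.160714])
simulated_scores = [constant_submission_score(a, stand_in_r) for a in probes]

recovered = recover_ratios(probes, simulated_scores)
print(recovered)
```

With real submissions you would replace `simulated_scores` with the three public scores from the leaderboard. Those are shown rounded, so the recovered ratios are only approximate.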

It turns out that r1=0.535714…, r2=0.303571…, r3=0.160714… In fact, that’s close to the class distribution in the original train set. That’s good!

Knowing [r1, r2, r3], we can find the best (public) score achievable with constant columns. Since lower is better for this metric, that means minimizing it, and the minimum is at [a1, a2, a3]=[r1, r2, r3]. This gives a score of about 0.99.
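A quick numerical check of this, again only a sketch: it plugs the rounded ratios from this post into the score formula, compares against a uniform constant submission, and against random constant vectors. The ratios are renormalized to sum exactly to 1, and `constant_score` is a helper name of mine.

```python
import numpy as np

# Rounded ratios from the post, renormalized so they sum exactly to 1.
r = np.array([0.535714, 0.303571, 0.160714])
r = r / r.sum()

def constant_score(a, r):
    """Log loss of a constant submission `a` given class ratios `r`."""
    return -float(np.dot(r, np.log(np.asarray(a, dtype=float))))

# Score when the constant prediction equals the class ratios.
best = constant_score(r, r)

# Score of the uniform constant prediction, for comparison.
uniform = constant_score([1 / 3, 1 / 3, 1 / 3], r)

# Random constant predictions on the simplex should never beat `best`.
rng = np.random.default_rng(0)
random_scores = [constant_score(a, r) for a in rng.dirichlet(np.ones(3), size=1000)]

print(round(best, 2), round(uniform, 2), min(random_scores) >= best)
```

So any model scoring above roughly 0.99 on the public leaderboard is doing worse than a constant guess of the class ratios.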