Team: Brain Blend
By @koleshjr & @Sodiq_Babawale_
The Amini Cocoa Contamination Challenge tasked participants with developing machine learning models capable of identifying multiple plant diseases from images of cocoa leaves. But there was a catch: these models needed to run on low-resource smartphones typically used by subsistence farmers in Africa, without sacrificing accuracy.
As Team Brain Blend, we built a lightweight, robust pipeline using YOLO11s and ranked 3rd overall. Here's how we did it 👇
The challenge was a blend of computer vision, model efficiency, and practical deployment, something we were genuinely excited to tackle.
We started by organizing the provided dataset using a stratified cross-validation approach:
This ensured each training round had a diverse and balanced dataset.
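The fold assignment can be sketched with scikit-learn's `StratifiedKFold`. The DataFrame layout and class names below are illustrative placeholders, not the competition's actual label schema:

```python
# Minimal sketch of stratified fold assignment. Column names and disease
# classes are illustrative, not the real competition schema.
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Toy label table: one row per image, one disease class per image.
df = pd.DataFrame({
    "Image_ID": [f"img_{i}.jpg" for i in range(30)],
    "class": ["anthracnose"] * 10 + ["cssvd"] * 10 + ["healthy"] * 10,
})

skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)
df["fold"] = -1
for fold, (_, val_idx) in enumerate(skf.split(df, df["class"])):
    df.loc[val_idx, "fold"] = fold

# Every fold now holds a class-balanced slice of the data.
```

With a split like this, each training round sees every disease class in both its training and validation slices.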
We chose YOLO11s due to its speed, performance, and compatibility with edge devices. Its tiny size allowed us to meet the deployment constraint without compromising much on accuracy.
We trained on folds 6, 7, and 8 of the dataset. Each fold was trained for ~2 hours and 30 minutes, keeping us well within the 9-hour limit.
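A per-fold run with the Ultralytics API might look like the sketch below. The dataset YAML names, epoch count, and batch size are assumptions, not our exact settings; only `yolo11s.pt` and the fold IDs come from the writeup:

```python
# Sketch of one fold's training run. Hyperparameters and the per-fold
# dataset YAML names are illustrative assumptions. The import is lazy so
# the sketch stands alone without the `ultralytics` package installed.
FOLDS = (6, 7, 8)

def train_fold(fold: int):
    from ultralytics import YOLO         # requires the `ultralytics` package
    model = YOLO("yolo11s.pt")           # small variant, edge-friendly
    model.train(
        data=f"cocoa_fold_{fold}.yaml",  # hypothetical per-fold data config
        epochs=50,                       # illustrative; tune to the time budget
        imgsz=640,
        batch=16,
        name=f"yolo11s_fold{fold}",
    )
    return model
```

Calling `train_fold` once per fold in `FOLDS` produces the three checkpoints used in the ensemble; at ~2h30 each, the three runs fit comfortably inside the 9-hour cap.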
We created an ensemble using Weighted Box Fusion (WBF) to combine predictions from all three models, improving robustness and detection confidence.
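To make the fusion step concrete, here is a minimal NumPy illustration of the WBF idea: greedily cluster boxes by IoU, then average each cluster's coordinates weighted by confidence. This is a teaching sketch, not a production implementation (a library such as `ensemble-boxes` is typically used), and it fuses only coordinates, not scores:

```python
# Minimal illustration of Weighted Box Fusion: boxes that overlap above an
# IoU threshold are merged into a confidence-weighted average box.
import numpy as np

def iou(a, b):
    """IoU between two xyxy boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda t: (t[2] - t[0]) * (t[3] - t[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def simple_wbf(boxes, scores, iou_thr=0.55):
    """Fuse overlapping boxes into score-weighted average boxes."""
    order = np.argsort(scores)[::-1]          # highest confidence first
    fused = []                                # entries: [weighted sum, weight]
    for i in order:
        for f in fused:
            if iou(f[0] / f[1], boxes[i]) >= iou_thr:
                f[0] += scores[i] * boxes[i]  # fold box into the cluster
                f[1] += scores[i]
                break
        else:
            fused.append([scores[i] * boxes[i].astype(float), scores[i]])
    return np.array([f[0] / f[1] for f in fused])

boxes = np.array([
    [0.10, 0.10, 0.50, 0.50],   # model A: a detection
    [0.12, 0.11, 0.52, 0.50],   # model B: near-duplicate of the same box
    [0.70, 0.70, 0.90, 0.90],   # model C: a separate detection
])
scores = np.array([0.9, 0.8, 0.6])
fused = simple_wbf(boxes, scores)   # two boxes: the near-duplicates merge
```

Unlike NMS, which keeps only the single highest-scoring box of a cluster, WBF blends the cluster's geometry, which is what makes it attractive for combining three independently trained folds.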
We performed inference across multiple image sizes to enhance generalization to unseen disease patterns:
[640, 800, 960, 1120, 1280, 1440]
💡 Special thanks to @kiminya for inspiring the multi-scale strategy.
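The multi-scale loop might look like this sketch; the weights path, confidence threshold, and helper name are illustrative assumptions. Boxes are returned in normalized coordinates so detections from every scale live in the same frame and can be fed straight into WBF:

```python
# Sketch of multi-scale inference. The scales list and max_det=600 come
# from the writeup; the weights path, conf threshold, and helper name are
# illustrative. The import is lazy so the sketch stands alone.
SCALES = [640, 800, 960, 1120, 1280, 1440]

def predict_multiscale(weights: str, image_path: str, conf: float = 0.01):
    from ultralytics import YOLO      # requires the `ultralytics` package
    model = YOLO(weights)
    boxes, scores, labels = [], [], []
    for imgsz in SCALES:
        res = model.predict(image_path, imgsz=imgsz, conf=conf, max_det=600)[0]
        boxes.append(res.boxes.xyxyn.cpu().numpy())   # normalized xyxy
        scores.append(res.boxes.conf.cpu().numpy())
        labels.append(res.boxes.cls.cpu().numpy())
    return boxes, scores, labels      # per-scale lists, ready for WBF
```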
Training: 8h 33min
Inference: 40min
This made our solution fully compliant with the challenge's time constraints (≤9h training, ≤3h inference).
To understand what our models were actually looking at, we implemented EigenCAM:
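A sketch of that visualization using the `grad-cam` package's `EigenCAM` class; the helper's name, the choice of target layer, and the preprocessing are assumptions, and a YOLO model typically needs its underlying `torch` module and a backbone layer passed in:

```python
# Hedged sketch of an EigenCAM heatmap overlay. Assumes the `grad-cam`
# package (import name `pytorch_grad_cam`); the helper name and the idea
# of passing a single backbone layer are ours, not the writeup's code.
def eigencam_overlay(torch_model, target_layer, input_tensor):
    from pytorch_grad_cam import EigenCAM               # optional dependency
    from pytorch_grad_cam.utils.image import show_cam_on_image

    cam = EigenCAM(torch_model, target_layers=[target_layer])
    grayscale = cam(input_tensor)[0]                    # HxW map in [0, 1]
    rgb = input_tensor[0].permute(1, 2, 0).numpy()      # CHW -> HWC, [0, 1]
    return show_cam_on_image(rgb, grayscale, use_rgb=True)
```

EigenCAM needs no gradients or class targets; it projects the chosen layer's activations onto their first principal component, which makes it a quick sanity check of what the detector actually attends to on a leaf.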
Welldone @koleshjr. This is detailed. It was nice learning from you.
It was nice collaborating with you @Sodiq_Babawale_
Woah, we did similar things🔥. Thanks for sharing @Koleshjr @Sodiq_Babawale_
Yeah pretty much the same thing 😅, thanks
🔥🔥
Congratulations to Brain Blend team and thank you for sharing your solutions
Congrats guys. I also did the same thing with yolo11s, except that I trained 5 folds at a resolution of 448.
One important detail I noticed in your inference is that you set max_det=600, i.e., double the default of 300. Was this critical to the final score? I noticed towards the end that subs with more detections scored a little better, but didn't pursue it further.
Not really; using the default leads to similar or slightly worse results.
In mine, it gives slightly worse results too. I feel the more the merrier works here.
@koleshjr Congratulations, your solution is insightful and thank you for sharing.
Thank you for sharing your detailed solution @koleshjr.
If I may ask, why did you choose folds 6,7,8 instead of other folds?
We were training subsets of 3 folds since that is what fit within the set time limit, and that combination had the best local validation scores; the LB score was not bad as well.
Well done + thanks for sharing