Amini Cocoa Contamination Challenge

Helping Ghana
$7 000 USD
Completed (11 months ago)
Computer Vision
Object Detection
928 joined
255 active
Start: Feb 14, 25
Close: May 11, 25
Reveal: May 12, 25
Koleshjr
Multimedia university of kenya
🥉 3rd Place Solution
Platform · 28 May 2025, 12:31 · 13

Team: Brain Blend

By @koleshjr & @Sodiq_Babawale_

The Amini Cocoa Contamination Challenge tasked participants with developing machine learning models capable of identifying multiple plant diseases from images of cocoa leaves. But there was a catch: these models needed to run on low-resource smartphones typically used by subsistence farmers in Africa without sacrificing accuracy.

As Team Brain Blend, we built a lightweight, robust pipeline using YOLO11s and ranked 3rd overall. Here's how we did it 👇

🧠 Challenge Overview

  • Goal: Detect all visible diseases on cocoa leaves.
  • Core constraints: models must generalize to unseen disease types and must run inference on low-resource smartphones.

The challenge was a blend of computer vision, model efficiency, and practical deployment, something we were genuinely excited to tackle.

🔄 ETL & Data Pipeline

We started by organizing the provided dataset using a stratified cross-validation approach:

  • 10-fold StratifiedGroupKFold to preserve class distribution and image-level grouping.
  • For each fold, annotations and images were formatted into YOLO-compatible folder structures: images/train, images/val, labels/train, labels/val

This ensured each training round had a diverse and balanced dataset.
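As a rough illustration of the formatting step, here is a minimal sketch. It assumes the fold assignment has already been made upstream (for example with scikit-learn's StratifiedGroupKFold) and that annotations arrive as pixel-space corner boxes; the helper names `to_yolo_line` and `fold_layout`, the `fold_0` root, and the input format are hypothetical and not taken from our repo.

```python
from pathlib import PurePosixPath


def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box into a YOLO label line.

    YOLO labels are `class x_center y_center width height`,
    with all four box values normalized by the image size.
    """
    cx = (xmin + xmax) / 2 / img_w
    cy = (ymin + ymax) / 2 / img_h
    w = (xmax - xmin) / img_w
    h = (ymax - ymin) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


def fold_layout(image_names, val_names, root="fold_0"):
    """Map each image file name to its (image, label) destination for one fold."""
    val = set(val_names)
    layout = {}
    for name in image_names:
        split = "val" if name in val else "train"
        stem = PurePosixPath(name).stem
        layout[name] = (
            str(PurePosixPath(root) / "images" / split / name),
            str(PurePosixPath(root) / "labels" / split / f"{stem}.txt"),
        )
    return layout
```

Each image then gets one `.txt` file under the matching `labels/` folder, with one line per annotated box.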

🧠 Modeling Approach

🧬 Model Choice: YOLO11s

We chose YOLO11s for its speed, accuracy, and compatibility with edge devices. Its small size allowed us to meet the deployment constraint without giving up much accuracy.

🔁 Training Strategy

We trained on folds 6, 7, and 8 of the dataset. For each fold:

  • The validation split was halved: one half was merged into the training data, and the other half was used for model evaluation.

Each fold was trained for ~2 hours and 30 minutes, keeping us well within the 9-hour limit.
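The halving of the validation split can be sketched as below. This is only an illustration: the random shuffle, the fixed seed, and the function name `split_validation` are assumptions, and the exact way the halves were chosen is not described above.

```python
import random


def split_validation(train_ids, val_ids, seed=42):
    """Move half of a fold's validation ids into training, keep the rest for evaluation.

    Assumption: the halves are chosen by a seeded random shuffle.
    """
    shuffled = list(val_ids)
    random.Random(seed).shuffle(shuffled)
    half = len(shuffled) // 2
    new_train = list(train_ids) + shuffled[:half]
    new_val = shuffled[half:]
    return new_train, new_val
```

The trade-off is more training data per model at the cost of a smaller, noisier evaluation set.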

🧪 Inference Strategy

🧠 Ensemble Learning

We created an ensemble using Weighted Boxes Fusion (WBF) to combine predictions from all three models, improving robustness and detection confidence.
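To show the idea behind the fusion step, here is a simplified pure-Python sketch for a single class with equal model weights. It is not our actual inference code; in practice a library implementation (such as the `ensemble-boxes` package) is the usual choice. The function names and the `iou_thr=0.55` default are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(cluster):
    """Confidence-weighted average box and mean confidence of a cluster."""
    total = sum(conf for _, conf in cluster)
    box = [sum(b[i] * c for b, c in cluster) / total for i in range(4)]
    return box, total / len(cluster)


def weighted_boxes_fusion(per_model_dets, iou_thr=0.55):
    """per_model_dets: one list per model of (box, conf) pairs for a single class."""
    n_models = len(per_model_dets)
    candidates = sorted(
        (d for dets in per_model_dets for d in dets),
        key=lambda d: d[1],
        reverse=True,
    )
    clusters = []
    for box, conf in candidates:
        # Attach the box to the best-overlapping fused box, if any exceeds the threshold.
        best, best_iou = None, iou_thr
        for cluster in clusters:
            overlap = iou(fuse(cluster)[0], box)
            if overlap > best_iou:
                best, best_iou = cluster, overlap
        if best is None:
            clusters.append([(box, conf)])
        else:
            best.append((box, conf))
    fused = []
    for cluster in clusters:
        box, conf = fuse(cluster)
        # Down-weight boxes that only a few of the models agreed on.
        conf *= min(len(cluster), n_models) / n_models
        fused.append((box, conf))
    return fused
```

Unlike NMS, which discards overlapping boxes, this averages them, so a box found by every model keeps its confidence while a box found by only one model is penalized.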

📐 Multi-Scale Inference

We performed inference across multiple image sizes to enhance generalization to unseen disease patterns:

[640, 800, 960, 1120, 1280, 1440]
💡 Special thanks to @kiminya for inspiring the multi-scale strategy.
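A minimal sketch of how the multi-scale runs could be collected before fusion is shown below. Each entry in `models` is a hypothetical callable `(image, size) -> detections`; with Ultralytics it might wrap `model.predict(image, imgsz=size)` and read the normalized boxes from the result. Whether every (model, size) pair is fused in a single pass is an assumption of this sketch.

```python
from itertools import product

IMAGE_SIZES = [640, 800, 960, 1120, 1280, 1440]


def collect_predictions(models, image, sizes=IMAGE_SIZES):
    """Return one detection list per (model, image size) pair, ready to be fused.

    `models` holds hypothetical callables taking (image, size) and returning
    detections in a shared coordinate space (e.g. normalized to [0, 1]).
    """
    runs = []
    for model, size in product(models, sizes):
        runs.append(model(image, size))
    return runs
```

With three fold models and six sizes this yields 18 prediction sets per image, which is where most of the inference time goes.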

⏱ Runtime Summary

  • Training: 8h 33min
  • Inference: 40min

This made our solution fully compliant with the challenge's time constraints (≤9h training, ≤3h inference).

🧩 Model Interpretability

To understand what our models were actually looking at, we implemented EigenCAM:

  • Extracts the first principal component of feature maps.
  • Helps visualize attention regions related to disease classification.
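The core of the idea can be sketched in plain Python as follows. This is an illustration under assumptions, not the code in our repo: it uses power iteration in place of a full SVD, does not center the activations (some implementations do), and expects non-negative feature maps from a single layer passed in as nested lists.

```python
def eigen_cam(activations, iterations=50):
    """Project feature maps onto their first principal direction.

    activations: nested list [C][H][W] of non-negative feature maps.
    Returns an H x W saliency map scaled to [0, 1].
    """
    channels = len(activations)
    height, width = len(activations[0]), len(activations[0][0])
    # Rows are spatial positions, columns are channels: shape (H*W, C).
    rows = [
        [activations[c][y][x] for c in range(channels)]
        for y in range(height)
        for x in range(width)
    ]
    # Power iteration on A^T A approximates the first right singular vector of A.
    v = [1.0 / channels ** 0.5] * channels
    for _ in range(iterations):
        proj = [sum(r[c] * v[c] for c in range(channels)) for r in rows]
        v = [
            sum(rows[i][c] * proj[i] for i in range(len(rows)))
            for c in range(channels)
        ]
        norm = sum(x * x for x in v) ** 0.5
        if norm == 0:
            break  # all-zero activations: nothing to project onto
        v = [x / norm for x in v]
    # Saliency at each position is its projection onto that direction.
    cam = [sum(r[c] * v[c] for c in range(channels)) for r in rows]
    lo, hi = min(cam), max(cam)
    scale = (hi - lo) or 1.0
    return [
        [(cam[y * width + x] - lo) / scale for x in range(width)]
        for y in range(height)
    ]
```

The resulting map is usually upsampled to the input resolution and overlaid on the image as a heatmap.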

⚙️ GitHub Repo

koleshjr/Amini-Cocoa-Contamination-Challenge: Can you develop a mobile-friendly machine learning model to identify diseases on cocoa leaves?

Discussion 13 answers
Sodiq_Babawale_
University of ibadan

Well done @koleshjr. This is detailed. It was nice learning from you.

28 May 2025, 12:38
Upvotes 1
Koleshjr
Multimedia university of kenya

It was nice collaborating with you @Sodiq_Babawale_

CodeJoe

Woah, we did similar things🔥. Thanks for sharing @Koleshjr @Sodiq_Babawale_

28 May 2025, 12:42
Upvotes 0
Koleshjr
Multimedia university of kenya

Yeah pretty much the same thing 😅, thanks

CodeJoe

🔥🔥

RareGem

Congratulations to the Brain Blend team, and thank you for sharing your solution.

28 May 2025, 12:54
Upvotes 0
nymfree

Congrats guys. I also did the same thing with yolo11s, except that I trained 5 folds at a resolution of 448.

One important detail I noticed in your inference is that you set

max_det=600 

i.e., double the default of 300. Was this critical to the final score? I noticed towards the end that submissions with more detections scored a little better, but didn't pursue it further.

28 May 2025, 13:03
Upvotes 0
Koleshjr
Multimedia university of kenya

Not really; using the default leads to similar or slightly worse results.

CodeJoe

In mine, it gives slightly worse results too. I feel the more the merrier works here.

Jaw22
Zindi africa

@koleshjr Congratulations, your solution is insightful and thank you for sharing.

28 May 2025, 13:40
Upvotes 0

Thank you for sharing your detailed solution @koleshjr.

If I may ask, why did you choose folds 6,7,8 instead of other folds?

28 May 2025, 13:45
Upvotes 0
Koleshjr
Multimedia university of kenya

We were training subsets of three folds, since that is what fit within the set time limit. That combination had the best local validation scores, and its leaderboard score was not bad either.

Well done + thanks for sharing

28 May 2025, 13:45
Upvotes 0