Our solution (Team Neural Beans with @100i and me) is conceptually simple: just a single object detection model. We fine-tuned a DINO model with a Swin-Base backbone using the MMDetection library.
There are two main pre-trained versions of this model in mmdet: a version with a ResNet50 backbone and 4 scales of feature maps (DINO-4scale-R-50), and a more performant version with a Swin-Large backbone and 5 scales of feature maps (DINO-5scale-Swin-L). The DINO-5scale-Swin-L model is too big and slow given the resource restrictions of this challenge. We performed experiments with different backbones (including ConvNext (Tiny and Small), Swin (Small, Base and Large), and SwinV2 (Base)), 4 and 5 feature scales, and different image sizes. Our best combination uses a Swin-Base backbone, 4 feature scales and square 640x640 images.
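To make the backbone swap concrete, here is a hypothetical sketch of how the mmdet DINO-4scale-R-50 config could be adapted to a Swin-Base backbone. The key names mirror the official DINO configs, but the base config filename, `out_indices`, and channel numbers are assumptions to verify against the mmdet model zoo, not our exact config:

```python
# Hypothetical sketch: adapt mmdet's DINO-4scale-R-50 config to Swin-Base.
# Key names follow the official DINO configs; exact values are assumptions.
_base_ = ['dino-4scale_r50_8xb2-12e_coco.py']  # assumed base config name

model = dict(
    backbone=dict(
        _delete_=True,               # drop the ResNet-50 settings entirely
        type='SwinTransformer',
        embed_dims=128,              # Swin-Base width
        depths=[2, 2, 18, 2],        # Swin-Base stage depths
        num_heads=[4, 8, 16, 32],
        window_size=7,
        out_indices=(1, 2, 3),       # three backbone stages feed the 4-scale neck
    ),
    neck=dict(in_channels=[256, 512, 1024]),  # Swin-Base stage output channels
)
```

The last (4th) feature scale is produced by the neck downsampling the deepest backbone stage, which is why only three backbone stages are exposed.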
Our training pipeline includes random horizontal and vertical flips, colour variations (using mmdet's YOLOXHSVRandomAug augmentation), and different image scales. Experiments with mosaic and mixup didn't improve the model. We utilised an Exponential Moving Average (EMA) of the weights during training, which improved validation performance.
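As a sketch (not our exact config), the augmentations described above map onto an MMDetection v3-style train pipeline roughly like this; the specific resize scales and the EMA momentum value are illustrative assumptions:

```python
# Hypothetical MMDetection (v3.x) train pipeline sketch for the augmentations
# described above; scale values and EMA momentum are illustrative only.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RandomFlip', prob=0.5,
         direction=['horizontal', 'vertical']),   # random H/V flips
    dict(type='YOLOXHSVRandomAug'),               # colour (HSV) jitter
    dict(type='RandomChoiceResize',               # vary the image scale
         scales=[(576, 576), (608, 608), (640, 640)],
         keep_ratio=True),
    dict(type='PackDetInputs'),
]

# EMA of the model weights via mmengine's hook mechanism.
custom_hooks = [
    dict(type='EMAHook', ema_type='ExpMomentumEMA', momentum=0.0002),
]
```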
The model was trained for 12 epochs with a learning rate of 0.0001, with linear warmup over the first epoch, and cosine annealing beginning after the 6th epoch. The model was trained with mixed precision to reduce GPU memory usage.
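The learning-rate schedule can be sketched in plain Python to show how it evolves over the 12 epochs (the minimum LR of 0 and the per-epoch granularity are assumptions; in practice mmengine's LinearLR and CosineAnnealingLR param schedulers express the same idea per iteration):

```python
import math

def lr_at(epoch, base_lr=1e-4, total_epochs=12, warmup_epochs=1,
          anneal_start=6, min_lr=0.0):
    """Sketch of the schedule described above: linear warmup over the first
    epoch, flat until the 6th epoch, then cosine annealing to min_lr at the
    end of training. Epoch may be fractional."""
    if epoch < warmup_epochs:
        return base_lr * epoch / warmup_epochs          # linear warmup
    if epoch < anneal_start:
        return base_lr                                  # constant plateau
    progress = (epoch - anneal_start) / (total_epochs - anneal_start)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))
```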
Resources:
Every time I come across that library, it's like seeing stars: very difficult to understand. The expertise behind it is undeniable. A huge congratulations to you guys.
Yeah, that library definitely needs some maintenance because it's getting harder and harder to manage the dependencies, especially in environments like Kaggle and Colab.
Very true. Anyway, great work, really learnt from your solution.
How do you manage slow inference speed when using MMDetection?
Thanks for the question. We had no problems with inference speed, so it is not something we spent much time thinking about. A few points: