Barbados Lands and Surveys Plot Automation Challenge

Helping Barbados

$10 000 USD

Under code review

Skills you will learn

Computer Vision

Geospatial Data

Optical Character Recognition

854 joined

179 active

Start

Aug 01, 25

Oct 19, 25

Reveal

Oct 20, 25

Joseph_gitau

African center for data science and analytics

Barbados Lands and Surveys Solution (2nd place, unofficial)

Notebooks · 29 Oct 2025, 17:59 · 4

Barbados Land Detection and OCR Project - Solution Summary

Code: https://github.com/josephgitau/Barbados-Lands-and-Surveys-Plot-Automation-Challenge/tree/main

Platform

Dual-notebook workflow executed on Colab GPUs (L4 for training, A100 High-RAM for inference).
Total pipeline execution time: ~6.5 hours.

Data Aspects

Source: cadastral survey map datasets.
Format: JPEG/PNG images with COCO-style annotation JSON (results.json).
Annotation: Used Label Studio for polygon mask annotation on 658 images.

Preprocessing and Post-processing

Image Preprocessing: Resizing to 1024x1024, encoder-specific normalization (Albumentations), and data augmentation (flips, rotations, color perturbations).
Post-processing: Polygons are cleaned by removing self-intersections, simplifying boundaries with Ramer-Douglas-Peucker (RDP, tolerance 0.0025), and smoothing near-straight lines.
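
To make the boundary simplification step concrete, here is a minimal pure-Python sketch of Ramer-Douglas-Peucker. It is not taken from the linked repository. The post does not state the unit of the 0.0025 tolerance, so this sketch assumes coordinates normalized to [0, 1] and treats the tolerance as an absolute distance. The function names `rdp` and `_point_segment_distance` are illustrative, and the distance is measured to the segment rather than the infinite line, which is a common variant that also handles closed rings where the first and last points coincide.

```python
import math


def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b (falls back to |p - a| if a == b)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def rdp(points, epsilon):
    """Simplify a polyline, dropping points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist > epsilon:
        # Keep the farthest point and recurse on both halves.
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right
    return [first, last]


# Assumed tolerance from the post, applied to normalized coordinates.
polyline = [(0, 0), (0.25, 0.001), (0.5, 0), (0.75, 0.3), (1, 0)]
simplified = rdp(polyline, epsilon=0.0025)
```

In this example the point `(0.25, 0.001)` lies within the tolerance of the chord and is dropped, while the corner at `(0.75, 0.3)` is kept.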

Model Architecture

Segmentation: UNet++ with an EfficientNet-B7 encoder.
OCR: Large Vision-Language Model Qwen3-VL-30B-A3B-Instruct-FP8 (used via vLLM runtime).
Training: PyTorch Lightning, Adam optimizer, mixed precision (16-bit AMP).
Loss Function: Combination of Boundary Loss, Focal Loss, and optional Dice/BCE.
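
As a rough illustration of how such a combined loss can be put together, the sketch below implements binary focal loss and Dice loss on flat lists of per-pixel probabilities and sums them with weights. Everything specific here is an assumption: the post gives no weights or hyperparameters, so `gamma`, `alpha`, `smooth`, `w_focal`, and `w_dice` are placeholder values. The boundary loss term is left out because it needs a distance transform of the ground-truth mask. The actual training code runs on PyTorch tensors; plain Python is used here only to keep the example dependency-free.

```python
import math


def focal_loss(probs, targets, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss; down-weights pixels the model already gets right."""
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 minus the smoothed overlap ratio."""
    intersection = sum(p * y for p, y in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


def combined_loss(probs, targets, w_focal=1.0, w_dice=1.0):
    """Weighted sum of focal and Dice terms (weights are assumed, not from the post)."""
    return w_focal * focal_loss(probs, targets) + w_dice * dice_loss(probs, targets)


good = combined_loss([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
bad = combined_loss([0.1, 0.9, 0.2, 0.8], [1, 0, 1, 0])
```

The focal term focuses training on hard pixels such as thin plot boundaries, while the Dice term counteracts class imbalance between plot and background pixels.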

Insight

The choice of the large, pretrained Qwen3-VL-30B for OCR was intentional to leverage its strong zero-shot generalization and avoid the risks of overfitting and memorization associated with Supervised Fine-Tuning (SFT) on noisy data.
The segmentation and OCR models combined achieved a Private Leaderboard Score of 0.970242006.
The dual pipeline delivers reproducible outputs (barbados_final.csv) that combine cleaned polygon geometries with OCR-extracted text.

Final Comments

The dataset was flawed from the start (you can check the data cleanup notebook for all stats), communication was an issue, and the rules were flawed in this competition. Despite these hurdles, the competition was an invaluable learning opportunity. The skills we have gained are priceless, and that matters more.

See you in the next one.

Discussion 4 answers

CodeJoe

Congrats Big man, and thank you for sharing your code. Time to dive in and learn!

29 Oct 2025, 18:12

Upvotes 1

Bone

Thank you for sharing your code. Your comments during the competition were very helpful. Every competition is indeed a pathway to learning. Hope to learn from you again in future competitions. Congrats!

29 Oct 2025, 19:49

Upvotes 1

Congratulations Joseph_gitau, and thank you for sharing your insights during this competition. It seems that your risky decision to use manual labels was more reasonable than not using them; at least you got to submit your solution to run on the hidden test set. See you in the next competitions!

30 Oct 2025, 02:09

Upvotes 2

ZaakciiRu

Apparently, you worked on the software creation of masks?

replied to 3B

31 Oct 2025, 09:41

Upvotes 0
