Primary competition visual

Barbados Lands and Surveys Plot Automation Challenge

Helping Barbados
$10 000 USD
Under code review
Computer Vision
Geospatial Data
Optical Character Recognition
854 joined
179 active
Starti
Aug 01, 25
Closei
Oct 19, 25
Reveali
Oct 20, 25
User avatar
Joseph_gitau
African center for data science and analytics
Barbados Lands and Surveys Solution (2rd place unofficial)
Notebooks · 29 Oct 2025, 17:59 · 4

Barbados Land Detection and OCR Project - Solution Summary

Code: https://github.com/josephgitau/Barbados-Lands-and-Surveys-Plot-Automation-Challenge/tree/main

Platform

  • Dual-notebook workflow executed on Colab GPUs (L4 for training, A100 High Ram for inference).
  • Total pipeline execution time: ~6.5 hours.

Data Aspects

  • Source: cadastral survey map datasets.
  • Format: JPEG/PNG images with COCO-style annotation JSON (results.json).
  • Annotation: Used Label Studio for polygon mask annotation on 658 images.

Preprocessing and Post-processing

  • Image Preprocessing: Resizing to 1024x1024, encoder-specific normalization (Albumentations), and data augmentation (flips, rotations, color perturbations).
  • Post-processing: Polygons cleaned with steps like removing self-intersections, simplifying boundaries (RDP = 0.0025), and smoothing near-straight lines.

Model Architecture

  • Segmentation: UNet++ with an EfficientNet-B7 encoder.
  • OCR: Large Vision-Language Model Qwen3-VL-30B-A3B-Instruct-FP8 (used via vLLM runtime).
  • Training: PyTorch Lightning, Adam optimizer, Mixed Precision (AMP 16).
  • Loss Function: Combination of Boundary Loss, Focal Loss, and optional Dice/BCE.

Insight

  • The choice of the large, pretrained Qwen3-VL-30B for OCR was intentional to leverage its strong zero-shot generalization and avoid the risks of overfitting and memorization associated with Supervised Fine-Tuning (SFT) on noisy data.
  • The segmentation and OCR models combined achieved a Private Leaderboard Score of 0.970242006.
  • The dual-pipeline delivers reproducible outputs (barbados_final.csv) combining cleaned polygon geometries and OCR-extracted text.

Final Comments

The dataset was flawed from the start (you can check the data cleanup notebook for all stats), communication was an issue and rules were flawed in this competiotion. Despite these hurdles, the competition was an invaluable learning opportunity. The skills we have attained are priceless, and that matter more.

See you in the next one.

Discussion 4 answers
User avatar
CodeJoe

Congrats Big man, and thank you for sharing your code. Time to dive in and learn!

29 Oct 2025, 18:12
Upvotes 1

Thank you for sharing your code. Your cemments during the competition were very helpful. Every completion is indeed a pathway to learning. Hope to learn from you again in future competitions. Congrats!

29 Oct 2025, 19:49
Upvotes 1
User avatar
3B

Congratulations Joseph_gitau and thank you for sharing your insights during this competition. It seems that your risky decision to use manual labels was more reasonable than not using them, at least you got to submit your solution to run on the hidden test set. See you in the next competitions!

30 Oct 2025, 02:09
Upvotes 2

Apparently, you worked on the software creation of masks?