
Barbados Lands and Surveys Plot Automation Challenge

Helping Barbados
$10 000 USD
Completed (5 months ago)
Computer Vision
Geospatial Data
Optical Character Recognition
895 joined
179 active
Start: Aug 01, 25
Close: Oct 19, 25
Reveal: Oct 20, 25
Brainiac
1st Place Solution - Barbados Lands and Surveys Plot Automation Challenge
Notebooks · 26 Nov 2025, 10:19 · 7

Solution Overview

First, I want to thank the Barbados Lands and Surveys Department for this incredible real-world challenge, Zindi Africa for hosting the competition, and all the participants who made this such an exciting competition.

My approach combined Vision-Language Models (VLMs) with deep learning segmentation to extract both geometries and metadata from analog survey plans. The key innovation was using VLMs not just for OCR, but as reasoning engines to solve spatial alignment problems.

The solution has two parallel pipelines:

1. Segmentation Pipeline:

Geometry Alignment: The training data already contains geographic coordinate shapes of land parcels, but they need to be aligned to pixel space. I used Qwen3-VL-30B (32B) by providing it with both the full survey plan map and an image of the geo coords polygon shape—the model intelligently found the corresponding pixel locations of the parcel boundaries

Model: UNet++ with EfficientNet-B5 encoder

Novel Approach: Surveyor bias model - learned per-surveyor embeddings to capture individual naming conventions and geometry patterns

Inference: 8-way TTA with IoU-based ensemble selection (picked polygons with highest consistency across augmentations)
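To make the inference step concrete, here is a minimal sketch of IoU-based selection across 8 augmented predictions. It is not the code from the repo: the choice of the 8 dihedral transforms (4 rotations, each with and without a horizontal flip) is an assumption, since the post does not list the augmentations, and `predict_fn`, `d4_apply`, `d4_invert`, and `select_most_consistent` are hypothetical names. It works on binary masks rather than polygons for simplicity.

```python
import numpy as np

def d4_apply(arr, k):
    """Apply the k-th of 8 dihedral transforms (assumed TTA set), k in 0..7."""
    out = np.rot90(arr, k % 4)
    return np.fliplr(out) if k >= 4 else out

def d4_invert(arr, k):
    """Undo d4_apply(arr, k) so the prediction is back in the original frame."""
    out = np.fliplr(arr) if k >= 4 else arr
    return np.rot90(out, -(k % 4))

def iou(a, b):
    """Intersection over union of two binary masks."""
    a, b = a.astype(bool), b.astype(bool)
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 1.0

def select_most_consistent(masks):
    """Index of the mask with the highest mean IoU against all the others."""
    n = len(masks)
    scores = [
        np.mean([iou(masks[i], masks[j]) for j in range(n) if j != i])
        for i in range(n)
    ]
    return int(np.argmax(scores))

def tta_predict(image, predict_fn):
    """Run predict_fn (hypothetical model wrapper) on 8 views, keep the consensus mask."""
    masks = [d4_invert(predict_fn(d4_apply(image, k)), k) for k in range(8)]
    return masks[select_most_consistent(masks)]
```

Unlike averaging, this keeps one complete prediction, so a single bad augmentation cannot blur the boundary of the selected mask.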

2. Text Extraction Pipeline:

Model: Fine-tuned Qwen3-VL-8B using Unsloth LoRA

Image Patchification: Split each image into 7 overlapping crops (1024×1024) to improve VLM focus

Automated Label Correction: Used VLM to audit and fix noisy training labels—the model takes in the raw labels and image, then returns corrected full names, addresses, and other metadata details before fine-tuning
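The patchification step can be sketched as follows. The post does not say how the 7 crops are arranged, so the layout here (a 3×2 grid of evenly spaced windows plus one centered crop) is purely an assumption for illustration, and `grid_boxes` and `seven_crops` are hypothetical names.

```python
def grid_boxes(width, height, size=1024, cols=3, rows=2):
    """Evenly spaced (left, top, right, bottom) boxes that overlap when the
    spacing between window origins is smaller than the crop size."""
    def starts(extent, n):
        if n == 1 or extent <= size:
            return [0]
        step = (extent - size) / (n - 1)
        return [round(i * step) for i in range(n)]

    return [
        (x, y, x + size, y + size)
        for y in starts(height, rows)
        for x in starts(width, cols)
    ]

def seven_crops(width, height, size=1024):
    """Assumed layout: 3x2 grid plus a centered crop, giving 7 boxes
    for images larger than the crop size in both dimensions."""
    boxes = grid_boxes(width, height, size, cols=3, rows=2)
    cx = max((width - size) // 2, 0)
    cy = max((height - size) // 2, 0)
    boxes.append((cx, cy, cx + size, cy + size))
    return boxes
```

Each box can be passed to an image library's crop call, and the crops sent to the VLM one at a time or as a multi-image prompt.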

Key Techniques

• 8-way TTA with IoU-based ensemble selection (not simple averaging)

• Patchifying images into 7 overlapping crops focused VLM attention on relevant regions

• Douglas-Peucker smoothing on extracted polygons reduced noise while preserving shape

• Mixed precision training (16-bit) for efficient 2048×2048 segmentation

• Surveyor bias embeddings captured domain-specific patterns (middle name conventions, preferred land mappings...)
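For readers who have not met Douglas-Peucker before, here is a small self-contained sketch of the algorithm on an open polyline. It is an illustration, not the code used in the solution; the tolerance `epsilon` is a free parameter whose value the post does not give, and this variant measures distance to the segment between the endpoints rather than to the infinite line.

```python
import math

def _point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def douglas_peucker(points, epsilon):
    """Simplify a polyline: drop points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    idx, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], a, b)
        if d > dmax:
            idx, dmax = i, d
    if dmax <= epsilon:
        # Every interior point is within tolerance, keep only the endpoints.
        return [a, b]
    # Otherwise keep the farthest point and recurse on both halves.
    left = douglas_peucker(points[: idx + 1], epsilon)
    right = douglas_peucker(points[idx:], epsilon)
    return left[:-1] + right

simplified = douglas_peucker([(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)], 0.1)
```

In this example the near-collinear point `(1, 0.05)` is dropped while the sharp corner at `(3, 2)` survives, which is the "reduce noise, preserve shape" behaviour described above. For a closed parcel boundary, one would typically split the ring into two polylines first or use a library routine that handles rings.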

Git Code Repo - If you found this solution helpful, please consider giving it a ⭐ on GitHub!

Discussion 7 answers
21db

Wow. Congratulations @Brainiac this is incredible. Thanks for sharing🙏

26 Nov 2025, 21:12
Upvotes 1
Brainiac

Thank you @21db

CodeJoe

@Brainiac just released his result. Tears of Joy😭.

I have never in my life heard of Douglas-Peucker.

And for this technique:

Geometry Alignment: The training data already contains geographic coordinate shapes of land parcels, but they need to be aligned to pixel space. I used Qwen3-VL-30B (32B) by providing it with both the full survey plan map and an image of the geo coords polygon shape—the model intelligently found the corresponding pixel locations of the parcel boundaries

How did you go about this? I am finding it confusing and really difficult to understand because I am like 'But how?' 😂😂.

Honestly, Congratulations once again big man! Your name indeed depicts the win!

28 Nov 2025, 09:21
Upvotes 0
Brainiac

@CodeJoe Thank you! I really appreciate the congratulations! 😄

For the geometry alignment part - Qwen3-VL (and other vision-language models) can return pixel coordinates/bounding boxes of objects in images.

How It Works:

You provide two inputs to the model:

  • Input 1: The cadastral survey map (PNG/JPG image) - this is in pixel space
  • Input 2: An image/visualization of the geographic coordinate polygon shape (the boundary in lat/lon or UTM coordinates)

The model can then reason and visually match the shape from the geo coords to the corresponding shape on the cadastral map, extracting the exact pixel boundaries or corners of the parcel of land.

The geographic coordinates already define the exact land parcel we're interested in - just in a different coordinate system. By providing both the cadastral map and the geo shape visualization, Qwen can infer the exact pixel locations through visual correspondence.
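As a rough illustration of how Input 2 could be prepared, the sketch below maps a geographic polygon onto a square canvas so it can be drawn as an image. This is an assumption about the preprocessing, not the exact code from the repo: `geo_to_canvas`, the canvas size, and the margin are hypothetical choices.

```python
def geo_to_canvas(coords, canvas=512, margin=16):
    """Map (x, y) geographic coordinates to pixel positions on a square canvas.

    One scale factor is used for both axes so the parcel keeps its aspect
    ratio, and y is flipped because image rows grow downward while
    northing or latitude grows upward.
    """
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    min_x, max_y = min(xs), max(ys)
    span = max(max(xs) - min_x, max_y - min(ys)) or 1.0
    scale = (canvas - 2 * margin) / span
    return [
        (margin + (x - min_x) * scale, margin + (max_y - y) * scale)
        for x, y in coords
    ]

# Toy rectangle in projected units, 10 wide by 5 tall.
pixels = geo_to_canvas([(0, 0), (10, 0), (10, 5), (0, 5)], canvas=120, margin=10)
```

The resulting points can be drawn as an outline with any image library (for example Pillow's `ImageDraw.polygon`) and sent to the VLM together with the survey plan. Projected coordinates such as UTM keep the shape undistorted; raw lat/lon would be slightly stretched.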

CodeJoe

Wow, so there wasn't any need to label. Just Wow. I am astonished!

Brainiac

Yeah, Qwen VLMs are quite powerful.

CodeJoe

Thank You SO MUCH for sharing! Truly grateful big man