First, I want to thank the Barbados Lands and Surveys Department for this incredible real-world challenge, Zindi Africa for hosting the competition, and all the participants who made it such an exciting contest.
My approach combined Vision-Language Models (VLMs) with deep learning segmentation to extract both geometries and metadata from analog survey plans. The key innovation was using VLMs not just for OCR, but as reasoning engines to solve spatial alignment problems.
Geometry Alignment: The training data already contains the land parcels' shapes in geographic coordinates, but they need to be aligned to pixel space. I used Qwen3-VL-30B, providing it with both the full survey plan map and an image of the geo-coordinate polygon shape; the model intelligently found the corresponding pixel locations of the parcel boundaries.
Model: Unet++ with EfficientNet-B5 encoder
Novel Approach: Surveyor bias model - learned per-surveyor embeddings to capture individual naming conventions and geometry patterns
Inference: 8-way TTA with IoU-based ensemble selection (picked polygons with highest consistency across augmentations)
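To make the selection step concrete, here is a minimal sketch of 8-way TTA with IoU-based consistency selection (the dihedral augmentation set, the 0.5 threshold, and the `predict_fn` interface are illustrative assumptions, not the exact competition code):

```python
import numpy as np

def dihedral_transforms(img):
    """Generate the 8 dihedral variants (4 rotations x optional flip)."""
    variants = []
    for k in range(4):
        rot = np.rot90(img, k)
        variants.append((rot, (k, False)))
        variants.append((np.fliplr(rot), (k, True)))
    return variants

def invert_transform(mask, params):
    """Map a predicted mask back to the original orientation."""
    k, flipped = params
    if flipped:
        mask = np.fliplr(mask)
    return np.rot90(mask, -k)

def iou(a, b):
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 1.0

def tta_select(image, predict_fn, thresh=0.5):
    """8-way TTA: predict on each augmented view, undo the augmentation,
    then keep the mask with the highest mean IoU against the other seven."""
    masks = []
    for aug, params in dihedral_transforms(image):
        pred = predict_fn(aug) > thresh
        masks.append(invert_transform(pred, params))
    scores = [np.mean([iou(m, o) for j, o in enumerate(masks) if j != i])
              for i, m in enumerate(masks)]
    return masks[int(np.argmax(scores))]
```

The idea is that a polygon which survives all eight augmentations largely unchanged is more trustworthy than one that flickers between views, so we keep the single most consistent mask instead of averaging probabilities.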
Model: Fine-tuned Qwen3-VL-8B using Unsloth LoRA
Image Patchification: Split each image into 7 overlapping crops (1024×1024) to improve VLM focus
Automated Label Correction: Used VLM to audit and fix noisy training labels—the model takes in the raw labels and image, then returns corrected full names, addresses, and other metadata details before fine-tuning
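For the patchification step, a generic overlapping tiler looks roughly like this (the 1024×1024 crop size matches the write-up, but the exact 7-crop layout depends on each plan's resolution, so the stride and edge handling here are assumptions):

```python
import numpy as np

def overlapping_crops(image, size=1024, stride=512):
    """Split an image into overlapping square crops.
    Returns (crop, (y, x)) pairs so extracted text can be mapped back."""
    h, w = image.shape[:2]
    ys = list(range(0, max(h - size, 0) + 1, stride))
    xs = list(range(0, max(w - size, 0) + 1, stride))
    # make sure the right and bottom edges are always covered
    if ys[-1] != max(h - size, 0):
        ys.append(max(h - size, 0))
    if xs[-1] != max(w - size, 0):
        xs.append(max(w - size, 0))
    return [(image[y:y + size, x:x + size], (y, x)) for y in ys for x in xs]
```

Each crop carries its (y, x) offset, so anything the VLM reads from a crop can be mapped back to full-image coordinates.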
• 8-way TTA with IoU-based ensemble selection (not simple averaging)
• Patchifying images into 7 overlapping crops focused VLM attention on relevant regions
• Douglas-Peucker smoothing on extracted polygons reduced noise while preserving shape
• Mixed precision training (16-bit) for efficient 2048×2048 segmentation
• Surveyor bias embeddings captured domain-specific patterns (middle name conventions, preferred land mappings...)
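For anyone unfamiliar with Douglas-Peucker, here is a minimal pure-Python version of the classic recursive algorithm (an illustrative sketch; in practice a library routine such as Shapely's `simplify`, which implements the same algorithm, is the easier choice):

```python
def douglas_peucker(points, epsilon):
    """Simplify a polyline: keep the endpoints, recursively keep any
    point farther than `epsilon` from the chord, drop the rest."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = (dx * dx + dy * dy) ** 0.5 or 1.0
    # perpendicular distance of each interior point to the chord
    dists = [abs(dy * (x - x1) - dx * (y - y1)) / norm
             for x, y in points[1:-1]]
    idx = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[idx - 1] > epsilon:
        left = douglas_peucker(points[:idx + 1], epsilon)
        right = douglas_peucker(points[idx:], epsilon)
        return left[:-1] + right
    return [points[0], points[-1]]
```

For example, `douglas_peucker([(0, 0), (2, 0.05), (4, 0)], epsilon=0.1)` drops the nearly-collinear middle vertex, which is exactly how jitter from pixel-level mask tracing gets cleaned off the extracted parcel polygons.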
Git Code Repo - If you found this solution helpful, please consider giving it a ⭐ on GitHub!
Wow. Congratulations @Brainiac this is incredible. Thanks for sharing🙏
Thank you @21db
@Brainiac just released his result. Tears of Joy😭.
I have never in my life heard of Douglas-Peucker.
And for this technique:
How did you go about this? I am finding it confusing and really difficult to understand, because I am like "But how?" 😂😂
Honestly, Congratulations once again big man! Your name indeed depicts the win!
@CodeJoe Thank you! I really appreciate the congratulations! 😄
For the geometry alignment part - Qwen3-VL (and other vision-language models) can return pixel coordinates/bounding boxes of objects in images.
How It Works:
You provide two inputs to the model:
• The full cadastral survey plan image
• An image of the parcel's polygon rendered from its geographic coordinates
The model can then reason and visually match the shape from the geo coords to the corresponding shape on the cadastral map, extracting the exact pixel boundaries or corners of the parcel of land.
The geographic coordinates already define the exact land parcel we're interested in - just in a different coordinate system. By providing both the cadastral map and the geo shape visualization, Qwen can infer the exact pixel locations through visual correspondence.
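In practice the request can be as simple as one chat message carrying both images plus an instruction to return pixel coordinates. This sketch uses an OpenAI-compatible payload; the model name, file paths, and requested JSON output format are placeholders, and the exact prompt and serving stack will differ:

```python
import json

# Hypothetical prompt; the coordinate format you ask for is up to you.
prompt = (
    "Image 1 is a scanned cadastral survey plan. Image 2 shows the outline "
    "of one land parcel, drawn from its geographic coordinates. Find the "
    "matching parcel on the survey plan and return the pixel coordinates "
    'of its corner points as JSON: {"corners": [[x, y], ...]}.'
)

# Hypothetical request body in the OpenAI-compatible chat format.
payload = {
    "model": "Qwen3-VL-30B",
    "messages": [{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "file://plan.png"}},
            {"type": "image_url",
             "image_url": {"url": "file://parcel_outline.png"}},
            {"type": "text", "text": prompt},
        ],
    }],
}
body = json.dumps(payload)
```

The model's reply is then parsed for the corner list, which gives the pixel-space polygon without any manual labeling.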
Wow, so there wasn't any need to label. Just Wow. I am astonished!
Yeah, Qwen VLMs are quite powerful
Thank You SO MUCH for sharing! Truly grateful big man