After nearly three months of hard work, experimenting with different algorithms, testing model variations, and carefully examining every aspect of the dataset, I've reached a very frustrating realization. It seems that the competition submissions are being validated against incorrect human-labeled data; in some cases, the errors appear to be completely random.

For example, image BYLS-107, which also appears more than 50 times across the training set, has a completely wrong mask. This is not a borderline labeling issue; it is simply incorrect. What's more concerning is that there has been no official acknowledgment or clarification from the organizers about data validation steps, or whether similar issues exist in the public or private test sets.

As participants, we've all invested significant time fine-tuning our pipelines, optimizing architectures, and exploring creative approaches to improve generalization. But if the evaluation is based on mislabeled or inconsistent ground truth, then the outcome is largely a matter of luck, not modeling skill or data understanding. Models that happen to mimic the labeling errors might rank higher, while those that genuinely learn meaningful patterns could perform worse.

This undermines the spirit of the competition and the credibility of its results. I genuinely hope the organizers can address this issue, provide clarification on data quality and verification, and let us know whether the same labeling problems exist in the test data used for scoring. Because without reliable ground truth, machine learning and computer vision approaches become meaningless; no model can learn from noise.
Mohammed, most of them are not wrong. It is the same land with different site plans, likely for valuation purposes (I can lease a plot from my own land even though it is the same land; valuation is done differently for the leased portion, meaning a different site plan). The recent ones are most likely the ones found in the images. Amy also confirmed that there is no mistake in the test set, so let's take her word for it.
Good luck big man!
Did you check the original polygon for BYLS-107? If you plot it with the original image you will see the mask is not aligned; it's almost a 90-degree rotation. Based on my understanding, the polygon and mask should overlap.
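For anyone who wants to reproduce this check, here's a minimal overlay sketch. The image here is synthetic and the polygon coordinates are made up; in practice you'd load the actual BYLS-107 tile and its provided polygon.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the aerial tile; replace with the real image,
# e.g. np.array(Image.open(...)) -- file name/format depends on the dataset.
image = np.zeros((256, 256, 3), dtype=np.uint8)

# Made-up polygon in (x, y) pixel coordinates.
polygon = [(40, 50), (200, 60), (190, 210), (35, 200)]

xs, ys = zip(*polygon)
fig, ax = plt.subplots()
ax.imshow(image)
# Close the ring by appending the first vertex at the end.
ax.plot(list(xs) + [xs[0]], list(ys) + [ys[0]], "r-", linewidth=2)
ax.set_title("polygon overlaid on image")
fig.savefig("overlay.png")
```

If the outline and the mask region don't sit on top of each other, you'll see the rotation immediately.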
I did for almost all and was confused at first. But when I visualized it on the map it actually made sense. All those polygons fit in the Bare land perfectly. I then realized most of the polygons were from different site plans.
There might be one or two I overlooked, but that is not significant enough to drop your score that low.
Have you tried flipping the y-axis again before visualizing?
it's not all about flipping.
I did inference on all the training set images, and there are samples you can never get right even though my model is OK.
I totally agree with you, because I did the same thing: I ran inference on the training data and the predicted masks perfectly fit the polygons, but evaluating the IoU between predicted masks and ground truth gives you a low IoU.
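For reference, the IoU check between a predicted mask and a ground-truth mask only takes a few lines of NumPy (the mask shapes and values below are made up for illustration):

```python
import numpy as np

def mask_iou(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU between two binary masks of the same shape."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 1.0  # both empty -> treat as perfect match

# Toy masks: identical masks give IoU 1.0.
a = np.zeros((4, 4), dtype=np.uint8); a[:2, :2] = 1
b = np.zeros((4, 4), dtype=np.uint8); b[:2, :2] = 1
print(mask_iou(a, b))  # 1.0
```

A prediction that visually "fits the polygon" but scores a low IoU against the provided mask is exactly the symptom of a misaligned ground-truth mask.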
Also, this second plot shows how the real polygon masks are imperfect; if you cross-check with the actual images you will note all the different irregularities present.
Thank you @Joseph_gitau for your info and for confirming the same point I mentioned. May I ask you, if you don't mind, to plot BYLS-107 specifically? That one has a totally wrong mask; just for confirmation. Thank you again.
I see, this is really different.
Well, if Amy says everything is correct, we just have to take her word for it. I mean, people would not be at 97 and 96 polygon scores
They would be, even with wrong polygons
I think the issue stems from using the image pixels to get the GT polygons. Note that the polygons provided are in geo coords, so there will be some small discrepancies, as it is not a 1-to-1 mapping (pixel polygon vs geo-coords polygon). This could explain an IoU of 0.95, but never 1. If you use bearings and distances to get the GT polygons, most of the polygons are correct, though there is still a number of images with wrong polygons.
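To illustrate the pixel-vs-geo point: a geo-coordinate vertex almost never lands exactly on a pixel center, so snapping to integer pixels introduces small errors at every vertex. A minimal sketch with a made-up GDAL-style geotransform (the origin and pixel size are invented for the example):

```python
# Made-up geotransform: x/y origin and pixel size in metres.
# Real values come from the image's georeferencing metadata.
x0, y0, px = 500000.0, 8200000.0, 0.5

def geo_to_pixel(easting, northing):
    """Map geo coords to fractional (col, row). Snapping the float result
    to integer pixels is where the small discrepancies creep in."""
    col = (easting - x0) / px
    row = (y0 - northing) / px  # the y axis is flipped in image space
    return col, row

c, r = geo_to_pixel(500010.3, 8199995.2)
print(round(c, 3), round(r, 3))  # 20.6 9.6 -- not an exact pixel
```

Accumulated over a whole polygon, these sub-pixel offsets are enough to cap IoU just below 1 even when the label itself is correct.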
These are all my assumptions and made-up theories 😂
This is what I have
Exactly, this was my point @Brainiac.
This is where domain knowledge could have worked wonders
Hi @Brainiac, I spent all day working on bearings and distances to get the GT masks. I used an affine transformation and georeference points, but I got the same issue. I know time is tight, but can you explain a methodology or pipeline for using bearings and distances to get exact GT polygons for images with correct shapes? Thanks.
Hi @Mohamed_abdelrazik — totally get where you’re coming from. I went down the same rabbit hole with cadastral maps; it took me a while (papers, articles, YouTube deep dives) to really grasp how bearings/distances tie back to ground truth. It’s a new domain, so one day isn’t realistic—don’t be hard on yourself.
Here’s the pipeline I use for bearings + distances:
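The core bearings-and-distances step is essentially a traverse computation. Here's a minimal sketch under my own assumptions (bearings measured clockwise from north; real site plans may use back bearings, and a closure correction is usually needed):

```python
import math

def traverse(start_e, start_n, legs):
    """Convert a list of (bearing_deg, distance) legs into easting/northing
    vertices, starting from a known point."""
    coords = [(start_e, start_n)]
    e, n = start_e, start_n
    for bearing_deg, dist in legs:
        theta = math.radians(bearing_deg)
        e += dist * math.sin(theta)  # easting grows with sin(bearing)
        n += dist * math.cos(theta)  # northing grows with cos(bearing)
        coords.append((e, n))
    return coords

# Made-up example: four legs tracing a 10 m square from the origin;
# the final vertex closes back on the start (up to float error).
square = traverse(0.0, 0.0, [(90, 10), (180, 10), (270, 10), (0, 10)])
```

If the last vertex doesn't land back on the first, the gap (the misclosure) tells you how much correction to distribute across the legs.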
Example image:
```python
import matplotlib.pyplot as plt

USE_BACK = False  # whether back bearings were used in the conversion step

def plot_coords(ax, coords, title=""):
    # coords: list of (easting, northing) from the bearing/distance conversion
    if not coords:
        raise ValueError("Empty coords.")
    xs, ys = zip(*coords)
    ax.plot(list(xs) + [xs[0]], list(ys) + [ys[0]])
    ax.set_aspect("equal", adjustable="box")
    ax.set_xlabel("Easting")
    ax.set_ylabel("Northing")
    ax.set_title(title)

fig, ax = plt.subplots()
plot_coords(ax, coords_en,
            title=f"Converted EN ({'Back' if USE_BACK else 'Forward'} bearings)")
```

@Brainiac Thank you Big man🙇
@Brainiac Thank you so much for your time and help, bro