If you have good resources, train a YOLO model with an image size of 1024 for about 100 epochs, and I can certainly get you under 0.80, all without using the tabular data.
Don't be confused: this image is simply the result of an extremely poor YOLO model. Due to limited resources, I could only train it for one epoch, so you can imagine how bad it turned out. However, with proper training, it can certainly achieve under 0.80. Call it a hunch (or my past experience)!
The polygon annotations are bad, so this won't work out of the box. YOLO will detect 6 panels as one, and you will have to do a lot of post-processing to get it right. My YOLO attempt scored 2.
If you can add one more regressor head to predict the number of panels/boilers in each detected bounding box, I think it should work.
Totally agree with @nymfree. I wouldn't advise anyone to put their last efforts into such an approach. I have already gone this way using a YOLO model, and the best score I could get was 1.92.
I think YOLO can work perfectly. It can differentiate panels and boilers without error. The problem is that for images with too many panels (200, for example), there is only one big polygon. The solution is to manually annotate those images or to split the polygon equally by the number of panels. So the score is bad only due to the big difference between 1 and 200, for example.
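To make the "split the polygon equally" idea concrete, here is a rough sketch of one way it could be done. It is only an illustration under assumptions: the helper names (`bounding_box`, `split_box_equally`) are hypothetical, the polygon is reduced to its axis-aligned bounding box, and the box is cut into equal strips along its longer side. Real panel arrays are often rotated or laid out in grids, so treat this as a starting point rather than the actual fix.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def bounding_box(polygon: List[Point]) -> Box:
    """Axis-aligned bounding box of a polygon given as (x, y) points."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def split_box_equally(box: Box, n_panels: int) -> List[Box]:
    """Cut a box into n_panels equal strips along its longer side.

    Assumption: panels sit side by side in a single row or column,
    which will not hold for every image.
    """
    if n_panels < 1:
        raise ValueError("n_panels must be at least 1")
    x0, y0, x1, y1 = box
    width, height = x1 - x0, y1 - y0
    strips: List[Box] = []
    if width >= height:
        step = width / n_panels
        for i in range(n_panels):
            strips.append((x0 + i * step, y0, x0 + (i + 1) * step, y1))
    else:
        step = height / n_panels
        for i in range(n_panels):
            strips.append((x0, y0 + i * step, x1, y0 + (i + 1) * step))
    return strips


# One big polygon that is annotated as covering 200 panels.
big_polygon = [(0.0, 0.0), (200.0, 0.0), (200.0, 10.0), (0.0, 10.0)]
panel_boxes = split_box_equally(bounding_box(big_polygon), 200)
print(len(panel_boxes))
```

Each strip can then be written out as its own label, so the detector sees 200 small targets instead of a single polygon that counts as 1.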
https://zindi.africa/competitions/lacuna-solar-survey-challenge/discussions/25780