Preprocessing:
- Delete images without annotations
- Delete badly annotated images (3 images). Each bbox is [x, y, w, h], so a box whose area is zero (w * h = 0) is not a valid rectangle and causes problems.
- Split the dataset (80% for training and 20% for validation)
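The preprocessing steps above can be sketched roughly as follows. This is only an illustration, not the exact code used: it assumes the annotations are held in a dict mapping an image id to its list of [x, y, w, h] boxes, and the names `has_valid_boxes`, `clean_and_split`, `train_fraction` and `seed` are hypothetical.

```python
import random


def has_valid_boxes(boxes):
    """Return True if the image has at least one box and no box has zero (or negative) area."""
    return len(boxes) > 0 and all(w * h > 0 for _, _, w, h in boxes)


def clean_and_split(annotations, train_fraction=0.8, seed=0):
    """Drop unusable images, then split the remaining image ids into train/validation.

    `annotations` is assumed to map image id -> list of [x, y, w, h] boxes.
    """
    # Keep only images that are annotated and whose boxes all have positive area.
    kept = sorted(img for img, boxes in annotations.items() if has_valid_boxes(boxes))
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(kept)
    cut = int(len(kept) * train_fraction)
    return kept[:cut], kept[cut:]
```

For example, `train_ids, val_ids = clean_and_split(annotations)` gives the 80/20 split over the images that survive the two deletion steps.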
Training:
- Used a Faster R-CNN model with a ResNet-152 backbone, treating the problem as an object detection task
- Trained for 2 epochs with a batch size of 4, the SGD optimizer and a learning rate of 0.001; the images were not resized
- Model trained for about 5 hours
Post-Processing:
Post-processing is key in this competition. I tried several confidence score thresholds and finally settled on 0.83.
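As a rough sketch of what thresholding on the confidence score looks like (the function name `filter_detections` and the list-based inputs are assumptions for illustration; only the 0.83 value comes from the write-up above):

```python
def filter_detections(boxes, scores, labels, threshold=0.83):
    """Keep only the detections whose confidence score is at least `threshold`.

    `boxes`, `scores` and `labels` are assumed to be parallel lists
    holding the model's predictions for one image.
    """
    keep = [i for i, score in enumerate(scores) if score >= threshold]
    return (
        [boxes[i] for i in keep],
        [scores[i] for i in keep],
        [labels[i] for i in keep],
    )
```

A higher threshold drops more low-confidence boxes, so it trades missed objects against false positives; trying several values on the validation split is one way to pick it.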
Final Comment:
Faster R-CNN models did better than the YOLO models. Post-processing is vital to a successful model.
Thank you
Nice - funny, I could train RCNN on my GPU but not infer with it, so had to abandon it.
You say train for 2 epochs? That does not sound right ...
Thanks,
I used the free Kaggle GPU tier for the entire exercise.
Yes, I trained for 2 epochs. The model is huge (it has 152 layers) and has many millions of parameters. The training data is about 250 MB. With anything beyond two epochs (fine-tuning the last 5 layers), the model will overfit.
Wow!!!! I thought 2 was a typo ...
This one for me ... between Keras and YOLO ... felt like taking a knife to a gun fight. But I'll definitely use both of them again. I know Keras is not the go-to CV lib, but I know it well and now know keras-cv too, and I am actually impressed. YOLO is like McD: it's as quick and easy as a simple drive-through, so yes, it is now part of my toolbox, despite a bad performance here.
Well, congrats again, especially for improving from public to private!
Thank you.
YOLO is brilliant and it has its advantages: speed and ease of use, as you said. I am sure there are competitors who used YOLO and achieved some really good results. There could be some YOLO tricks that I have yet to learn.