Are you using YOLO? If so, you can try a larger model such as yolo11m.pt. That works better than increasing imgsz. If it's too slow, you can reduce imgsz to 960. You can also train for more epochs.
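For anyone who wants to try the suggestion above, here is a minimal sketch, assuming the Ultralytics Python package. The dataset file `data.yaml` and the epoch count of 200 are placeholders, not values from this thread, and the helper names are made up for illustration.

```python
def train_args(imgsz=960, epochs=200, data="data.yaml"):
    """Collect training settings; the data path and epoch count are placeholders."""
    return {"data": data, "imgsz": imgsz, "epochs": epochs}


def train_larger_model(weights="yolo11m.pt", **overrides):
    """Train the medium checkpoint with the settings built above."""
    # Imported lazily so train_args() can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # load the pretrained medium model
    return model.train(**train_args(**overrides))
```

Calling `train_larger_model()` uses imgsz 960; pass `epochs=...` to train longer.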
But that defeats the aim of deploying to low-end devices like smartphones, because the resulting model will be too large. Target the smaller versions of YOLO with a sizeable portion of the dataset, then fine-tune from there.
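A rough sketch of that approach, again assuming the Ultralytics package. Reading "a sizeable portion of the dataset" as the `fraction` training argument is my interpretation, and the 0.5 value, the `data.yaml` path, the epoch count, and the TFLite export target are all placeholder assumptions.

```python
def finetune_args(data="data.yaml", fraction=0.5, imgsz=640, epochs=100):
    """Build fine-tuning settings; every default here is a placeholder."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return {"data": data, "fraction": fraction, "imgsz": imgsz, "epochs": epochs}


def finetune_and_export(weights="yolo11n.pt", export_format="tflite", **overrides):
    """Fine-tune the nano checkpoint, then export it for on-device use."""
    # Imported lazily so finetune_args() can be used without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # smallest YOLO11 detection checkpoint
    model.train(**finetune_args(**overrides))
    return model.export(format=export_format)  # returns the exported file path
```

Swapping `yolo11n.pt` for `yolo11s.pt` trades some speed and size for accuracy, if the phone can handle it.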
I think you are right.
What else is there besides the YOLO architecture?