What’s the point of calling it a competition if originality doesn’t matter anymore? Some of us have spent weeks experimenting, building, and refining our own ideas from scratch, only to watch others skyrocket in rank by literally cloning a public solution.
It’s disheartening, it’s unfair, and it completely destroys the spirit of this challenge. The whole point of Zindi is to encourage creativity, learning, and fair play — not to reward those who break the rules and turn the leaderboard into a playground for duplicates.
I really hope the organizers take immediate action. Because right now, this doesn’t feel like a fair competition anymore — it feels like a joke at the expense of those who actually put in real effort.
We don't want any newbie to come in, submit just one or two submissions, and grab a spot that another participant has been working hard to achieve.
I would like to ask @Amy_Bray to look into this matter. Is it even fair? Providing a starter notebook like the one Zindi provides does not give any proper idea of the pipeline.
This challenge has just turned out to be a copycat show.
Tbh the chaos has not even begun yet!! Atm there are no copycats in the top 10. Everyone in the top 10 was already there before that solution was posted, but in the next few days you will see the changes. Better put your effort into the segmentation part now!
Calling them "copycats" is insane
All I can say is may the best win. I wish you all the best!
Are you on Kaggle?
Have you ever taken part in a comp on that platform? Go there and you will see.
I think Zindi is safer because we don't have a dead section called "Code" like Kaggle does.
Good Luck 🤞
The competition consists of two parts. Posting a solution to half of it to help newcomers is not only acceptable, it is actively encouraged in the community. And no one is going to break into the top 20 just by copying the posted solution, so this is not as big a problem as you make it seem.
Let's be honest. You wouldn't say this if you were in the top of the LB. What Joseph shared is THE top OCR solution that he has had for nearly a month. He probably shared it out of frustration as he is now #2 after leading for so long.
Probably. I'm not at the top of the leaderboard, so I don't know how I'd react if I were. But I don't think he posted it out of frustration. His OCR solution was not the top solution; his segmentation solution was. Posting the OCR solution only makes him more vulnerable to falling off the leaderboard.
My best-performing solution uses a completely different model and approach 😂😂. So don't be judgemental. This ain't about winning. When I asked questions concerning the data issues, no one made this much effort to follow up, but when a solution is shared you're all afraid?
Lol ok. We are all afraid. Lol. Anyway, let the fun begin.
Everyone understands the frustration, but @Koleshjr also provided a pretty solid OCR solution and…
Isn't it basically the same thing Joseph aimed to do?
To be honest, it's quite frustrating, but he's not entirely wrong, and accusing him of doing that out of pure frustration is just far-fetched.
My solution is a direct implementation of the public Unsloth notebook; I only changed the data loading and a few hyperparameters. I haven't even shared the inference notebook or the prompt.
Also, what I shared is somewhat similar to Koleshjr's work on the stream.
@crossentropy Timing, my bro, timing. But true, it has happened, so we go back to competing. We still have the segmentation part to figure out.
Given that there are no constraints on the final model's inference requirements, the VLLM approach is probably the best option here, especially since the rules state that no data manipulation is allowed to train traditional OCR models, and the given labels do not allow you to train those models directly anyway.
Honestly, I feel bad for not thinking of this approach earlier.