It seems someone is taking cheating to a whole new, industrial level. All 8 of these accounts were created today, made their 10 submissions (the daily maximum), and have very similar, great scores:
What is interesting about this is that it is attacking RMSE. With near-infinite LB submissions, you could slowly but steadily narrow down what the true (public) LB values are, which is an enormous advantage. You can then come up with a "magic" weighted-ensemble strategy that just happens to land extremely close to the true LB values.
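To make the RMSE point concrete for anyone thinking about detection, here is a minimal sketch on synthetic data. It assumes the scorer returns exact RMSE over a known number of rows; the names `recover_value` and `leaderboard` are made up for illustration and have nothing to do with the platform's real scoring code. Changing one prediction by `delta` shifts the squared error by `delta**2 - 2*delta*(y_k - p_k)`, so two scores are enough to solve for one hidden value.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error, as a plain float."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def recover_value(score_fn, baseline, k, delta=1.0):
    """Recover hidden target k from two RMSE scores (toy setting).

    Assumes score_fn returns exact RMSE over all len(baseline) rows.
    """
    n = len(baseline)
    probe = baseline.copy()
    probe[k] += delta                      # change a single row
    mse0 = score_fn(baseline) ** 2         # score of the baseline
    mse1 = score_fn(probe) ** 2            # score after the one-row change
    # n * (mse1 - mse0) = delta^2 - 2 * delta * (y_k - p_k)
    return baseline[k] + (delta ** 2 - n * (mse1 - mse0)) / (2 * delta)

# Synthetic "hidden" targets standing in for a public leaderboard.
rng = np.random.default_rng(0)
hidden = rng.normal(size=50)
leaderboard = lambda pred: rmse(hidden, pred)

baseline = np.zeros(50)
recovered = recover_value(leaderboard, baseline, k=7)
```

In this toy setup each extra submission leaks one value, which is why a pool of fresh accounts with 10 submissions each matters so much. Real leaderboards round scores and hide which rows are public, so the leak is noisier, but the principle is the same.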
I think this is where the community can come together and come up with smart ways to detect cheaters - because honestly, I think the organizers might get fooled here, even if they have code in their hand.
So, a side competition for everyone: given the code and a reproducible way of getting to a (winning) submission, how would you prove someone is cheating? I'll leave it open-ended for now, but I have some ideas.
I mean for now it is very easy to create accounts. My immediate suggestion would be for organizers to implement KYC verification, similar to how Kaggle handles it. Users would need to verify their identity before joining competitions.
And it’s not just one person. Both this competition and the Telco Agentic Challenge have had similar issues, likely because the limited number of submissions allowed creates an incentive for multi-accounting. I think this is becoming difficult for organizers to manage manually.
@AJoel @meganomaly
Another thing organizers could consider is setting a cutoff date for joining competitions, maybe one or two weeks before the final submission deadline. That would make it harder for people to create fresh accounts near the end just to gain extra submissions or bypass limits through multi-accounting.
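A join cutoff like this is a very small rule to implement. Here is a rough sketch; `can_join` and the 14-day default are assumptions for illustration, not anything the platform currently exposes.

```python
from datetime import datetime, timedelta

def can_join(join_time, final_deadline, cutoff_days=14):
    """Allow joining only up to `cutoff_days` before the final deadline."""
    return join_time <= final_deadline - timedelta(days=cutoff_days)

# Made-up dates, purely for illustration.
deadline = datetime(2025, 6, 30, 23, 59)
early = can_join(datetime(2025, 6, 1), deadline)   # joins well before the cutoff
late = can_join(datetime(2025, 6, 25), deadline)   # tries to join 5 days before the end
```

This only blocks late joiners; it does nothing about accounts created in bulk before the cutoff, so it would work best alongside other checks.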
@Koleshjr you have to consider the cost.
What KYC requires
This is expensive not just in compute, but in human operations + legal overhead.
Kaggle can absorb this because it is backed by Google, infrastructure-heavy, and positioned as a semi-professional ecosystem.
@Semaka_Mathunyane Thanks for educating me. I think @JONNY's suggestion is the most immediate fix then.
That’s fair, but organizers don’t necessarily need to build and operate the entire KYC stack themselves anymore. There are third-party providers like Persona, Sumsub, Onfido, etc. that handle document verification, storage compliance, and much of the regulatory overhead.
Kaggle itself uses Persona for identity verification. So the model already exists for competition platforms that want stronger anti multi-account protections without building the infrastructure from scratch.
Hi all,
Thank you for raising these concerns — we genuinely appreciate the engagement and the thoughtful suggestions from the community.
We want to assure you that we take cheating and fairness seriously. Many of the points raised are already being actively explored or addressed, and we are committed to responding to issues in a timely and transparent manner.
We will share updates as soon as we are in a position to do so.
Hi @AJoel,
For this competition, could you please remove the following accounts:
All of these accounts were created on the same day, and there is a clear pattern suggesting duplicate activity.
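The pattern described here (same creation day, daily submission limit used up right away) is easy to screen for automatically. A minimal sketch, assuming the organizers can export account records; the field names `name`, `created`, and `subs_on_first_day` are hypothetical, as are the thresholds.

```python
from collections import defaultdict
from datetime import date

def flag_same_day_clusters(accounts, daily_limit=10, min_cluster=3):
    """Group accounts that hit the daily limit on their creation day.

    Returns {creation_date: [names]} for days with at least
    `min_cluster` such accounts.
    """
    by_day = defaultdict(list)
    for acc in accounts:
        if acc["subs_on_first_day"] >= daily_limit:
            by_day[acc["created"]].append(acc["name"])
    return {day: sorted(names)
            for day, names in by_day.items()
            if len(names) >= min_cluster}

# Made-up records, purely for illustration.
accounts = [
    {"name": "u1", "created": date(2025, 5, 1), "subs_on_first_day": 10},
    {"name": "u2", "created": date(2025, 5, 1), "subs_on_first_day": 10},
    {"name": "u3", "created": date(2025, 5, 1), "subs_on_first_day": 10},
    {"name": "old_user", "created": date(2024, 1, 9), "subs_on_first_day": 2},
]
flagged = flag_same_day_clusters(accounts)
```

A flag like this is a lead for manual review, not proof: a legitimate newcomer can also sign up and use all their submissions on day one.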
Hi all,
Thank you for continuing to raise these concerns. Following a review, the users listed, along with several others, were removed from the leaderboard and placed on probation in accordance with Zindi’s rules.
Hey, I think you only removed the secondary probing accounts from the leaderboard, but these were not the main accounts of whoever was cheating.
Perhaps detect multi-account behavior via device/IP clustering?
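One way the device/IP idea could look in practice is to link any two accounts that share an IP address or a device fingerprint, then report the connected groups. This is only a sketch under the assumption that the platform logs `(account, ip, device)` tuples; `cluster_accounts` and the record layout are hypothetical.

```python
from collections import defaultdict

def cluster_accounts(logins):
    """Group accounts linked by a shared IP or device (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Tag nodes by kind so an account name never collides with an IP string.
    for account, ip, device in logins:
        union(("acct", account), ("ip", ip))
        union(("acct", account), ("dev", device))

    groups = defaultdict(set)
    for node in list(parent):
        if node[0] == "acct":
            groups[find(node)].add(node[1])
    # Only groups with more than one account are interesting.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

# Made-up login records (documentation-range IPs), purely for illustration.
logins = [
    ("a", "192.0.2.1", "dev1"),
    ("b", "192.0.2.1", "dev2"),   # shares an IP with "a"
    ("c", "192.0.2.9", "dev2"),   # shares a device with "b"
    ("d", "192.0.2.50", "dev3"),  # unrelated
]
clusters = cluster_accounts(logins)
```

Shared IPs are common on university networks, offices, and mobile carriers, so a cluster would still need a human look, ideally combined with other signals such as near-identical submissions.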
I totally agree with the concerns about multiple accounts: that is a real design flaw of the platform that has become evident in these last days of competition. However, I would like to point out that my last submission was 5 days ago and my account is older, so at least one assumption here is incorrect. I am sure the admins will handle the situation in the best way possible.
Sorry about that. I took a screenshot of all those accounts and listed you there as well. I apologize.
We still have more. These accounts were all created on the same day.
I seriously doubt that accounts can reach the top of the LB with fewer than 10 submissions!?