TAHMO Incoming Solar Radiation Prediction Challenge

Prize: $10 000 USD · Status: under code review · Tags: Prediction, Geospatial Analysis · 1525 joined, 760 active
Start: Apr 01, 26 · Enrolment closes: May 24, 26 · Close: May 24, 26 · Reveal: May 24, 26
Detecting multi-accounting, TAHMO edition
22 May 2026, 17:53

Edit: I originally posted this in the wrong place, whoops.

I wanted to hold this post until the end of the competition, mostly so that it would be too late for multi-accounting competitors to cover their tracks.

Platform-specific measures against multi-accounting are hard to enforce and can always be gamed, but the TAHMO challenge has some unique characteristics that make multi-accounting behaviour detectable.

# 1. Low MBE scores without probing

Context: the per-station solar radiation bias differs hugely between the train and test sets. Basically, no method built only on the public dataset can get a leaderboard MBE below ~1.8; to go lower, you would need to probe the leaderboard with carefully constructed submissions. The simplest example: make a submission X that is all zeros; then make a submission X_i in which the predicted radiation for station i is set to 1500 and everything else stays zero. The difference between the MBE scores of X and X_i lets you recover the exact MBE of station i.
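To make the probing arithmetic concrete, here is a small simulation on synthetic data. It is only a sketch: I am assuming the leaderboard MBE is the average over stations of the absolute per-station mean error, which may not match the real metric exactly, and the station count, the label range and the helper names (`leaderboard_mbe`, `probe_station_mean`) are all made up for illustration.

```python
import numpy as np

def leaderboard_mbe(pred, truth, station_ids):
    """ASSUMED metric: mean over stations of |mean(pred - truth)| for that station."""
    stations = np.unique(station_ids)
    biases = [abs(np.mean(pred[station_ids == s] - truth[station_ids == s]))
              for s in stations]
    return float(np.mean(biases))

# Synthetic stand-in for the hidden test labels (hypothetical sizes and range).
rng = np.random.default_rng(0)
n_stations, n_per_station = 5, 200
station_ids = np.repeat(np.arange(n_stations), n_per_station)
truth = rng.uniform(0.0, 1000.0, size=station_ids.size)  # always below the probe value

PROBE_VALUE = 1500.0

# Submission X: all zeros.
baseline = np.zeros_like(truth)
score_x = leaderboard_mbe(baseline, truth, station_ids)

def probe_station_mean(i):
    """Submission X_i: zeros everywhere except station i, which is set to PROBE_VALUE."""
    probe = baseline.copy()
    probe[station_ids == i] = PROBE_VALUE
    delta = leaderboard_mbe(probe, truth, station_ids) - score_x
    # Under the assumed metric: delta = (PROBE_VALUE - 2 * mean_i) / n_stations
    return (PROBE_VALUE - n_stations * delta) / 2.0

recovered = np.array([probe_station_mean(i) for i in range(n_stations)])
actual = np.array([truth[station_ids == i].mean() for i in range(n_stations)])
print(np.allclose(recovered, actual))
```

Once the mean true value of a station is known, the bias of any other submission for that station is just its own mean prediction minus that value, which is why one probe per station is enough under this assumed metric.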

How is this useful: to get a near-zero leaderboard MBE, you would need at least 41 probing submissions. Even to get the MBE below 1, you would need a few probing submissions to correct the stations with the highest bias.

Multi-accounting test: any account that gets the bias of station i right (especially for the badly behaved stations whose bias comes as a surprise) without a history of probing submissions is extremely suspicious.
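A crude version of this test could look like the sketch below. It assumes the organizers can see each account's score history, and it treats any submission with a very bad score as "probe-like" (an all-zeros or 1500-valued probe scores terribly). The `Account` class, the `is_suspicious` helper, the `probe_score` cut-off and the example scores are all hypothetical; only the "below 1" level comes from the discussion above. A careful prober could hide probes inside otherwise good submissions, so a real audit would need to look at the files themselves.

```python
from dataclasses import dataclass

@dataclass
class Account:
    name: str
    scores: list  # leaderboard MBE of each submission, in chronological order

def is_suspicious(account, low_mbe=1.0, probe_score=100.0, min_probes=1):
    """Flag accounts that reach a low MBE without any probe-like submissions.

    low_mbe:     MBE level assumed unreachable without probing.
    probe_score: hypothetical cut-off above which a score looks like a probe.
    min_probes:  minimum number of probe-like submissions expected.
    """
    best = min(account.scores)
    n_probe_like = sum(s >= probe_score for s in account.scores)
    return best < low_mbe and n_probe_like < min_probes

# Made-up score histories for illustration.
accounts = [
    Account("prober", [2.1, 350.0, 412.0, 388.0, 0.6]),
    Account("honest", [2.4, 2.0, 1.9]),
    Account("no_probe_low_mbe", [2.2, 0.5]),
]
flagged = [a.name for a in accounts if is_suspicious(a)]
print(flagged)
```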

# 2. Correlating submission files across accounts

Context: even when your datasets and engineered features are the same, training an lgbm model vs an xgb model will give you slightly different predictions. You can use cosine similarity between prediction vectors as a measure of how alike two submissions are; in fact, this comes in handy when putting together your final ensemble (you don't want to keep too many submissions that are highly correlated with each other).
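For the ensemble use case, a minimal sketch of the idea: compute cosine similarity between prediction vectors and greedily drop any submission that is too close to one you already kept. The data is synthetic, and the model names, the noise levels and the `max_sim` cut-off are placeholders I picked for the example, not values from my actual pipeline.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two prediction vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def prune_similar(preds, max_sim=0.99):
    """Greedily keep submissions whose similarity to every kept one is <= max_sim."""
    kept = []
    for name, vec in preds.items():
        if all(cosine(vec, preds[k]) <= max_sim for k in kept):
            kept.append(name)
    return kept

# Synthetic predictions: two "model families" sharing a signal with independent noise,
# plus a near-duplicate of the first one.
rng = np.random.default_rng(0)
base = rng.uniform(0.0, 1000.0, size=500)
preds = {
    "lgbm": base + rng.normal(0.0, 200.0, size=500),
    "xgb": base + rng.normal(0.0, 200.0, size=500),
}
preds["lgbm_reseed"] = preds["lgbm"] + rng.normal(0.0, 1.0, size=500)

kept = prune_similar(preds)
print(kept)
```

The near-duplicate is dropped because it is almost identical to the first submission, while the two independently noisy ones both survive.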

How is this useful: in my final submissions, I typically still had ~0.95 cosine similarity across the different model families (lgbm, xgb, catboost, transformer). The odds of two independent accounts reaching very high cosine similarity (>0.99) across their submissions are extremely low.

Multi-accounting test: check cosine similarity across all submission pairs (across all users). You can build a graph out of this (nodes: individual accounts; an edge between accounts A and B if they have a submission pair with cosine similarity > threshold). This should reveal clusters of multi-accounting, but even a single pair of accounts with highly correlated submissions is suspicious.
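The graph test can be sketched in a few lines with a union-find over accounts, so that clusters come out as connected components. Everything here is illustrative: the account names and predictions are synthetic, `multi_account_clusters` is a made-up helper, and the 0.99 threshold is just the figure mentioned above, which would need tuning on real submissions.

```python
import numpy as np
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two prediction vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def multi_account_clusters(submissions, threshold=0.99):
    """submissions: dict mapping account -> list of prediction vectors.

    Two accounts are linked if ANY pair of their submissions has cosine
    similarity above the threshold. Returns connected components of size >= 2.
    """
    accounts = list(submissions)
    parent = {a: a for a in accounts}

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in combinations(accounts, 2):
        if any(cosine(u, v) > threshold
               for u in submissions[a] for v in submissions[b]):
            parent[find(a)] = find(b)

    groups = {}
    for a in accounts:
        groups.setdefault(find(a), set()).add(a)
    return [g for g in groups.values() if len(g) > 1]

# Synthetic example: acct_b reuses acct_a's file, acct_c reuses another of acct_b's.
rng = np.random.default_rng(0)
base = rng.uniform(0.0, 1000.0, size=500)

def independent_model():
    return base + rng.normal(0.0, 200.0, size=500)

def near_copy(vec):
    return vec + rng.normal(0.0, 1.0, size=500)

a1, b2, d1 = independent_model(), independent_model(), independent_model()
submissions = {
    "acct_a": [a1],
    "acct_b": [near_copy(a1), b2],
    "acct_c": [near_copy(b2)],
    "acct_d": [d1],
}
clusters = multi_account_clusters(submissions)
print(clusters)
```

Note that acct_a and acct_c never share a near-identical file directly; they end up in the same cluster only through acct_b, which is the reason for building a graph rather than just listing pairs.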

# 3. Let the community figure it out?

Once the competition is over, I would propose releasing the submissions and submission history of each participant, and letting the community do the auditing.

No need to share the code; that is a very private matter. But the submission files alone give a rich enough information set to detect multi-accounting.

Alright, ~2 more days left, best of luck to all!
