TAHMO Incoming Solar Radiation Prediction Challenge

fgbfgb

Detecting multi-accounting, TAHMO edition

22 May 2026, 17:53

Edit: I originally posted this in the wrong place, whoops.

I wanted to hold this post back until the end of the competition, mostly so that it is too late for multi-accounting competitors to cover their tracks.

Platform-specific multi-accounting measures are hard to enforce and can always be gamed, but the TAHMO challenge has some unique characteristics that make multi-accounting behaviour detectable.

# 1. Low MBE scores without probing

Context: The per-station solar radiation bias differs hugely between the train and test sets. Basically, no method built around the public dataset can get a leaderboard MBE (mean bias error) below ~1.8; to go lower, you would need to probe the leaderboard with carefully constructed submissions. The simplest example: make a submission X with all zeros; then make a submission X_i, where the predicted radiation value for station i is set to 1500; the difference between the MBE scores of X and X_i lets you recover the exact MBE of station i.
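A minimal sketch of the probing idea, run against a simulated leaderboard. The post does not state how the leaderboard MBE is aggregated, so this assumes it is the mean over stations of the absolute per-station mean bias; `leaderboard_mbe` and `probe_station_means` are hypothetical helper names, and the data is synthetic. Under that assumption, one baseline submission plus one probe per station recovers each station's mean test radiation, which is what you need to correct a model's bias on that station.

```python
import numpy as np

def leaderboard_mbe(pred, truth, station):
    """ASSUMED metric: mean over stations of |mean(pred - truth)|."""
    per_station = [
        abs(np.mean(pred[station == s] - truth[station == s]))
        for s in np.unique(station)
    ]
    return float(np.mean(per_station))

def probe_station_means(score_fn, station, probe_value=1500.0):
    """Recover each station's mean target from 1 + n_stations submissions.

    Valid only while 0 <= station mean <= probe_value, so that the
    absolute values in the assumed metric open with a known sign.
    """
    stations = np.unique(station)
    n = len(stations)
    base = score_fn(np.zeros(len(station)))      # submission X: all zeros
    means = {}
    for s in stations:
        x = np.zeros(len(station))
        x[station == s] = probe_value            # submission X_i
        delta = score_fn(x) - base               # (probe_value - 2 * mean_s) / n
        means[s] = (probe_value - n * delta) / 2.0
    return means

# Synthetic demo: 5 stations with 20 hidden test rows each.
rng = np.random.default_rng(0)
station = np.repeat(np.arange(5), 20)
hidden_truth = rng.uniform(0, 1000, size=station.size)
score = lambda pred: leaderboard_mbe(pred, hidden_truth, station)

recovered = probe_station_means(score, station)
for s, m in recovered.items():
    print(s, round(m, 3), round(hidden_truth[station == s].mean(), 3))
```

If the real metric is aggregated differently, the algebra in `probe_station_means` changes, but the pattern stays the same: each probe isolates one station's contribution to the score.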

How is this useful: to get a near-zero leaderboard MBE, you would need at least 41 probing submissions. Even to get the MBE below 1, you would need a few probing submissions to correct the stations with the highest bias.

Multi-accounting test: any account that gets the bias of station i right (especially for the ill-behaved stations that come as a surprise) without a history of probing submissions is extremely suspicious.

# 2. Correlating submission files across accounts

Context: even when your datasets and engineered features are the same, training an lgbm model vs an xgb model will give you small differences in the predictions. You can use cosine similarity to measure how alike two submissions are; in fact, this comes in handy when putting together your final ensemble (you don't want to keep too many submissions that are too similar/highly correlated with each other).
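For reference, a small sketch of cosine similarity between two submission vectors. It assumes both files have already been loaded as arrays with rows in the same order; the arrays below are made-up numbers, not real submissions.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two prediction vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for an all-zero vector")
    return float(a @ b / denom)

# Illustrative values only.
model_a = np.array([120.0, 480.0, 760.0, 300.0])
model_b = np.array([130.0, 455.0, 790.0, 280.0])
print(cosine_similarity(model_a, model_b))
```

Because radiation predictions are non-negative, raw cosine similarity tends to sit fairly high even for unrelated models; subtracting each vector's mean first turns this into Pearson correlation, which is a stricter variant.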

How is this useful: typically, in my final submissions, I still had ~0.95 cosine similarity across the different model families (lgbm, xgb, catboost, transformer). The odds of two independent accounts getting high cosine similarity (>0.99) across their submissions are extremely low.

Multi-accounting test: check cosine similarity across all submission pairs (across all users). You can build a graph out of this (nodes: individual accounts; an edge between accounts A and B if they have a submission pair with cosine similarity > threshold). This should reveal clusters of multi-accounting, but any two accounts with correlated submissions are suspicious.
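A rough sketch of the graph test, assuming every submission has been loaded as a vector with the same row order and is not all zeros. The `submissions` mapping, the account names, and the `suspicious_clusters` helper are hypothetical; connected components are found with a small union-find, so no graph library is needed.

```python
from itertools import combinations
import numpy as np

def suspicious_clusters(submissions, threshold=0.99):
    """Group accounts linked by any submission pair with cosine > threshold.

    submissions: dict mapping account name -> list of prediction vectors.
    """
    # Normalise every submission to unit length so a dot product is the cosine.
    unit = {}
    for account, files in submissions.items():
        m = np.asarray(files, dtype=float)
        unit[account] = m / np.linalg.norm(m, axis=1, keepdims=True)

    parent = {a: a for a in unit}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path compression
            a = parent[a]
        return a

    # Edge between two accounts if ANY of their submission pairs is too similar.
    for a, b in combinations(unit, 2):
        if (unit[a] @ unit[b].T).max() > threshold:
            parent[find(a)] = find(b)

    groups = {}
    for a in unit:
        groups.setdefault(find(a), []).append(a)
    return [sorted(g) for g in groups.values() if len(g) > 1]

# Synthetic demo: "alice" and "bob" share a model, "carol" is independent.
rng = np.random.default_rng(0)
shared = rng.uniform(0, 1000, size=500)
submissions = {
    "alice": [shared + rng.normal(0, 1, 500)],
    "bob": [shared + rng.normal(0, 1, 500), rng.uniform(0, 1000, 500)],
    "carol": [rng.uniform(0, 1000, 500)],
}
print(suspicious_clusters(submissions))
```

The threshold of 0.99 is the figure from the post; in practice it would need tuning against the similarity you see between honest, independent accounts.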

# 3. Let the community figure it out?

Once the competition is over, I propose releasing the submissions and submission history of each participant and letting the community do the auditing.

No need to share the code; that is a very private matter. The submission files alone give a rich enough information set to detect multi-accounting.

Alright, ~2 more days left, best of luck to all!
