Edit: I originally posted this in the wrong place, whoops.
I wanted to hold this post back until the end of the competition, mostly so that it would be too late for multi-accounting competitors to cover their tracks.
Platform-specific measures against multi-accounting are hard to enforce and can always be gamed - but the TAHMO challenge has some unique characteristics that make multi-accounting behaviour detectable.
Context: The per-station solar radiation bias values are hugely different between the train and test sets. Basically, no method built around the public dataset can get leaderboard MBE values lower than ~1.8; to go below that, you would need to probe the leaderboard with carefully constructed submissions. The simplest example: make a baseline submission X=0 (all zeros); then make a submission X_i in which the predicted radiation value for station i is set to 1500. The difference between the MBE scores of X and X_i lets you recover the exact MBE of station i.
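To make the probing idea concrete, here is a minimal sketch on synthetic data. The exact leaderboard formula is not given in this post, so the scorer below is an assumption: it takes the score to be the mean over stations of the absolute per-station mean bias. The names `leaderboard_score` and `probe_station_mean` are hypothetical helpers, not anything provided by the platform. Under that assumption, one baseline submission plus one probe per station is enough to recover each station's true mean radiation, and from that the bias of any model on that station.

```python
import numpy as np

def leaderboard_score(pred, truth, station_ids):
    """Hypothetical stand-in for the hidden scorer (an assumption):
    mean over stations of |mean(pred - truth)| for that station."""
    per_station = [
        abs(np.mean(pred[station_ids == s] - truth[station_ids == s]))
        for s in np.unique(station_ids)
    ]
    return float(np.mean(per_station))

def probe_station_mean(score_fn, station_ids, station, n_stations, probe_value=1500.0):
    """Recover the true mean of one station from two scores.

    With all zeros, station i contributes m_i / S to the score.
    With station i set to probe_value (> m_i), it contributes
    (probe_value - m_i) / S. Solving the difference for m_i gives
    m_i = (probe_value - S * delta) / 2.
    """
    base = np.zeros(len(station_ids))
    probe = base.copy()
    probe[station_ids == station] = probe_value
    delta = score_fn(probe) - score_fn(base)
    return (probe_value - n_stations * delta) / 2.0

# Synthetic "hidden" test set: 5 stations, 200 rows each (made-up numbers).
rng = np.random.default_rng(0)
n_stations, n_per_station = 5, 200
station_ids = np.repeat(np.arange(n_stations), n_per_station)
truth = rng.uniform(0, 1000, size=station_ids.size)

def score_fn(pred):
    # Plays the role of "submit and read the leaderboard score".
    return leaderboard_score(pred, truth, station_ids)

recovered = probe_station_mean(score_fn, station_ids, station=2, n_stations=n_stations)
true_mean = truth[station_ids == 2].mean()
print(recovered, true_mean)
```

On a real leaderboard the recovered value would only be as precise as the displayed score, and the algebra changes if the actual metric differs from the one assumed here.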
How is this useful: to get a near-zero leaderboard MBE, you would need at least 41 probing submissions. Even to get the MBE below 1, you would need a few probing submissions to correct the stations with the highest bias.
Multi-accounting test: any account that gets the bias of station i right (especially for the ill-behaved stations that come as a surprise) without a history of probing submissions is extremely suspicious.
Context: even when your datasets and engineered features are the same, training an lgbm model vs an xgb model will give you small differences in the predictions. You can use cosine similarity between prediction vectors to measure how close two submissions are; in fact, this comes in handy when putting together your final ensemble (you don't want to keep too many submissions that are too similar/highly correlated with each other).
How is this useful: typically, in my final submissions, I still had ~0.95 cosine similarity across the different model families (lgbm, xgb, catboost, transformer). The odds of two independent accounts getting high cosine similarity (>0.99) across their submissions are extremely low.
Multi-accounting test: check cosine similarity across all submission pairs (across all users). You can build a graph out of this (nodes: individual accounts; an edge between accounts A and B if they have a submission pair with cosine similarity > threshold). This should reveal clusters of multi-accounting, but even a single pair of accounts with highly correlated submissions is suspicious.
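The graph test above could be sketched roughly as follows. This is only an illustration on made-up data: the `submissions` dictionary layout, the account names, and the helper `suspicious_clusters` are all assumptions, and the 0.99 threshold is simply the figure mentioned above. Clusters are found with a small union-find over the suspicious edges.

```python
import itertools
import numpy as np

def cosine(a, b):
    """Cosine similarity between two prediction vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def suspicious_clusters(submissions, threshold=0.99):
    """submissions: dict mapping account name -> list of prediction vectors
    (all aligned to the same test rows). Returns (edges, clusters)."""
    accounts = list(submissions)
    parent = {a: a for a in accounts}

    def find(a):
        # Union-find root lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    edges = []
    for a, b in itertools.combinations(accounts, 2):
        # Most similar submission pair between the two accounts.
        best = max(cosine(x, y) for x in submissions[a] for y in submissions[b])
        if best > threshold:
            edges.append((a, b, best))
            parent[find(a)] = find(b)

    groups = {}
    for a in accounts:
        groups.setdefault(find(a), set()).add(a)
    clusters = [g for g in groups.values() if len(g) > 1]
    return edges, clusters

# Made-up example: "bob" resubmits "alice"'s predictions with tiny noise,
# "carol" submits unrelated predictions.
rng = np.random.default_rng(0)
alice_pred = rng.uniform(0, 1000, size=500)
submissions = {
    "alice": [alice_pred],
    "bob": [alice_pred + rng.normal(0, 1, size=500)],
    "carol": [rng.uniform(0, 1000, size=500)],
}
edges, clusters = suspicious_clusters(submissions)
print(edges)
print(clusters)
```

Because radiation predictions are all non-negative, raw cosine similarity is high even between unrelated submissions, so the threshold has to sit well above the level seen between honest, independently trained models.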
Once the competition is over, I would propose releasing the submissions + submission history of each participant and letting the community do the auditing.
No need to share the code - that is a very private matter. But the submission files alone give a rich enough set of information to detect multi-accounting.
Alright, ~2 more days left, best of luck to all!