I would like to raise a critical issue regarding the dataset structure of the challenge that allows anyone to achieve a perfect score of 1.0 on the leaderboard without training any machine learning model.
The Issue: Monotonic target sorting leak
It appears that the original dataset was sorted by the target label column (all 0s first, followed by all 1s) before being split into the Train and Test sets, and the original row order was preserved without shuffling:
Train Set: Train.csv is strictly sorted by the target label (the first 570 rows have label 0, and the remaining 393 rows have label 1).
Test Set: Because the split preserved the row order, the hidden test labels appear to be strictly sorted as well. The evidence is statistical: for each feature, its correlation with the target in Train closely tracks its correlation with the row index in Test. The two per-feature correlation vectors correlate at 0.911, which would be very unlikely if the Test rows were in random order.
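For anyone who wants to reproduce the two checks above, here is a minimal sketch using only the Python standard library. It runs on synthetic stand-in data, not the real files: the feature weights, the noise model, and the 300 positive rows in the stand-in test set are all made up for illustration, and the function names are mine.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def is_sorted_by_label(labels):
    """True if the labels never decrease from one row to the next."""
    return all(a <= b for a, b in zip(labels, labels[1:]))


def order_leak_score(train_cols, train_labels, test_cols):
    """Correlate two per-feature vectors:
    corr(feature, target) in train vs. corr(feature, row position) in test."""
    positions = list(range(len(test_cols[0])))
    train_corr = [pearson(col, train_labels) for col in train_cols]
    test_corr = [pearson(col, positions) for col in test_cols]
    return pearson(train_corr, test_corr)


def make_sorted_columns(n_zeros, n_ones, weights, rng):
    """Synthetic stand-in: label-sorted rows, one noisy column per weight."""
    labels = [0] * n_zeros + [1] * n_ones
    cols = [[w * y + rng.gauss(0, 1) for y in labels] for w in weights]
    return cols, labels


rng = random.Random(0)
weights = [-2 + 4 * i / 9 for i in range(10)]  # hypothetical feature strengths
train_cols, train_labels = make_sorted_columns(570, 393, weights, rng)
# 515 zeros matches the step index in the post; 300 ones is an assumption.
test_cols, _hidden_labels = make_sorted_columns(515, 300, weights, rng)

train_is_sorted = is_sorted_by_label(train_labels)
score = order_leak_score(train_cols, train_labels, test_cols)
print(train_is_sorted, round(score, 3))
```

On the real data you would load the Train.csv columns in place of the synthetic ones. A score near 1 means the test row position behaves like the train target.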
How it can be exploited:
Because the test set is sorted by label, a participant can achieve a perfect score of 1.0 (ROC-AUC = 1.0, F1-Score = 1.0) simply by:
Setting TargetRAUC as a strictly increasing function of the row index (e.g. row_idx / len(test)), which ranks all 1s above all 0s, guaranteeing a perfect ROC-AUC.
Setting TargetF1 to a step function at index 515 (predicting 0 for the first 515 rows, and 1 for the rest), which matches the true sorted labels.
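To make the two steps concrete, here is a small sketch of why they yield perfect scores when the labels are sorted. It assumes a label-sorted test set with 515 zeros (from the step index above) followed by 300 ones (a made-up count, since the real number is hidden). The metric functions are simple hand-written versions, not the platform's scorer.

```python
def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.
    Ties count as half. Quadratic in size, fine for a small illustration."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_pred):
    """F1 for the positive class from true/false positives and false negatives."""
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn)


n_zeros, n_ones = 515, 300  # 515 from the post; 300 is hypothetical
y_true = [0] * n_zeros + [1] * n_ones  # assumed hidden labels, sorted
n = len(y_true)

# TargetRAUC: any strictly increasing function of the row index.
target_rauc = [i / n for i in range(n)]
# TargetF1: step function at the boundary between the 0s and the 1s.
target_f1 = [0 if i < n_zeros else 1 for i in range(n)]

auc = roc_auc(y_true, target_rauc)
f1 = f1_score(y_true, target_f1)
print(auc, f1)  # prints 1.0 1.0
```

No feature is read at any point, which is the core of the problem: the row position alone determines the score.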
Impact on the Competition:
This leak bypasses all machine learning efforts, prevents real innovation, and compromises the integrity of the leaderboard (which explains the perfect 1.0 score currently visible).
Thank you for your work in organizing this competition, and I hope this feedback helps improve the challenge.
Oh, and I thought the competition had ended 😅
O.M.G. 😂😂😂
Independently confirmed; proposing a reshuffle + rescore.
I can reproduce this finding.
The practical effect is that genuine modelling signal is masked on the public leaderboard. Could the organizers please shuffle the test row order and rescore the existing submissions? That removes the exploit without altering anyone's actual predictions, and restores the leaderboard as a measure of real model quality.
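A rough sketch of why shuffling should help, under the assumption that a prediction built from row position gets scored against labels in the new, shuffled order. The 515/300 class counts are stand-ins (only the 515 comes from the thread), and the AUC function is a simple hand-written one. This does not cover submissions keyed by IDs that still encode the original order, which is a separate question.

```python
import random


def roc_auc(y_true, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.
    Ties count as half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


n_zeros, n_ones = 515, 300  # 515 from the thread; 300 is hypothetical
sorted_labels = [0] * n_zeros + [1] * n_ones
n = len(sorted_labels)

# Position-based "prediction": a ramp over the row index, no features used.
ramp = [i / n for i in range(n)]

# Same labels, but with the row order randomized (fixed seed for repeatability).
shuffled_labels = sorted_labels[:]
random.Random(0).shuffle(shuffled_labels)

auc_sorted = roc_auc(sorted_labels, ramp)
auc_shuffled = roc_auc(shuffled_labels, ramp)
print(auc_sorted, round(auc_shuffled, 3))
```

With sorted labels the ramp is perfect. After shuffling it should land near 0.5, which is chance level.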
Thanks for catching and raising this.
I agree.
I think they should acknowledge this and then reopen or rescore using a clean, shuffled/private test set with random IDs.
@Zindi Could you please shuffle the test row order and rescore the existing submissions?