Hello participants,
Following a recent investigation, we have identified that some submissions may be benefiting from unintended signals related to geospatial information. Specifically, it is possible to significantly improve leaderboard scores by leveraging latitude/longitude data in combination with external occurrence datasets. We want to remind you that, as per the competition info and rules:
These rules are in place to ensure a fair and consistent evaluation based on the intended TerraClimate features.
If your approach may have used restricted data or signals (including indirect use of latitude/longitude or external datasets), please take one of the following actions by 23:59 GMT on Saturday 25 April:
As per Zindi’s competition rules, we reserve the right to request code from any participant at any time. Failure to comply or submission of non-compliant solutions may result in disqualification from this challenge.
This is about ensuring fairness and maintaining the integrity of the challenge for everyone. Thank you!
Hello @meganomaly, I didn't send my leaked submissions to you because I was busy. Can I still proceed?
I realise some participants may have missed this message on Friday. I am willing to extend the deadline to 17:00 GMT today. @marching_learning @witcher_007 @Mutombwa @maz @CelestineC @chiwai @Sylvester @MICADEE @sroy_097 @Knowledge_Seeker101 @CodeJoe @Sammy_S_Mutuku @serahnjogu @ML_Wizzard @Joseph_gitau @aspita @Johvans @DarthVader123 @Hafiz_Adjei @Johannes_G @MetaLearner @vengatesh @abdelrhman012018 @Bwenge840 @Muzalendu @Kilonzo @Manuel786 @lokpatchu @CisseJ @machine_learning @eyadhjarray111 @mlnjsh @chuckML @abuuchilax @Jameel @Hendrixx @GodsonNtungi @game_ale @okech @Paul_O @rukundob451 @Frackson @Djibrilla @Elzamkan @britonmichael7 @xplicit756 @emmanuel2 @ozumba @tImIhAcK @cookieML
I sent the code to you.
Hello @meganomaly, I did send my code to (zindi[@]zindi.africa) on Sunday evening. But I will for sure forward it to you as well. Thank you.
Hi @meganomaly,
I'd like to ask for confirmation: does using distance-based features computed from the latitude and longitude columns violate the rule against using latitude and longitude features?
Yes, it violates the rule. That is why I dropped my best sub.
Yes, using distance-based features computed from latitude and longitude would still be considered a violation of the challenge rules. Even if you’re not including the raw Latitude and Longitude columns directly in your model, any features derived from them (e.g. distances between points, nearest neighbours, clustering, grid cells, etc.) are effectively reintroducing spatial information. This goes against the intent of the rule, which is to ensure models rely on the provided environmental/climate data rather than geographic location. A good rule of thumb:
So in this case, distance-based features would fall into the second category and should be avoided.
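One way to guard against accidentally reintroducing location is to build the training matrix from an explicit allowlist of provided climate columns, so that raw coordinates and anything derived from them are excluded by default. The snippet below is only a minimal sketch of that idea: the climate column names and the derived-feature names are hypothetical placeholders, not the competition's actual schema, so substitute the names listed in the Data section.

```python
# Minimal sketch: keep only columns on an explicit allowlist.
# ASSUMPTION: these climate column names are placeholders; replace them
# with the provided feature names from the competition's Data section.
ALLOWED_CLIMATE_FEATURES = {"tmax", "tmin", "ppt", "soil"}


def split_columns(columns, allowed=ALLOWED_CLIMATE_FEATURES):
    """Return (kept, rejected) column lists, preserving input order."""
    kept = [c for c in columns if c in allowed]
    rejected = [c for c in columns if c not in allowed]
    return kept, rejected


# Hypothetical feature table containing raw coordinates and two
# location-derived features alongside climate variables.
columns = [
    "Latitude",
    "Longitude",
    "tmax",
    "ppt",
    "dist_to_nearest_occurrence",  # hypothetical distance-based feature
    "grid_cell_id",                # hypothetical grid-cell feature
]

kept, rejected = split_columns(columns)
print("train on:", kept)
print("excluded:", rejected)
```

An allowlist is used here rather than a blocklist because a blocklist only catches the derived features someone remembered to name, whereas an allowlist rejects any column that is not a known provided feature.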
See Data section:
Ok. Thank you.
@meganomaly I see. Thank you for this further clarification.
Hi @meganomaly, apologies for missing the deadline on Sunday. I'm formally requesting that all my submissions on this competition be removed from the leaderboard. My approach used external occurrence data (Atlas of Living Australia / GBIF) and distance-based features computed from lat/lon -- both violations of the rules clarified in this thread. I'll start over using only the provided TerraClimate features. Please let me know if you need anything else from me. Thanks for your patience. -- Mutombwa