Inundata: Mapping Floods in South Africa

Helping South Africa

$10 000 USD

Completed (over 1 year ago)

Skills you will learn

Classification

1342 joined

314 active

Start

Nov 22, 24

Feb 16, 25

Reveal

Feb 17, 25

bit_guber

Code verification parts help

Platform · 17 Feb 2025, 05:54 · 9

Does the final code need to reproduce the test predictions without the post-processing step? My LB score improved from 0.002447672 to 0.002315709 with the boost described in https://zindi.africa/competitions/inundata-mapping-floods-in-south-africa/discussions/25014

It would be better if you validated the models' raw predictions only; that would help the outcome generalize.

Thanks to https://zindi.africa/users/snow for pointing this out before the competition ended.

Discussion 9 answers

nymfree

Such post-processing is illegal. It's why those who had 0.0020x scores or better are not at the top: they didn't select those dodgy submissions. Looks like you gambled here and now want special rules.

17 Feb 2025, 06:05

Upvotes 0

bit_guber

What is your raw prediction score on the LB?

replied to nymfree · 17 Feb 2025, 06:10

Upvotes 0

nymfree

0.0021x. @bit_guber I might have completely misunderstood you - my apologies

replied to bit_guber · 17 Feb 2025, 06:13

Upvotes 0

Koleshjr

Multimedia university of kenya

Our score on the leaderboard is without the row-order leak, @bit_guber.

replied to bit_guber · 17 Feb 2025, 07:49

Upvotes 0

marching_learning

Nostalgic Mathematics

I don't know about the others, but our 0.0020x submission didn't use any leak or any post-processing 😊. It just overfitted to the training data and public LB despite rigorous cross-validation. Sometimes that happens.

replied to nymfree · 17 Feb 2025, 08:21

Upvotes 2

Koleshjr

Multimedia university of kenya

So sorry, @marching_learning, you were very strong in this one. Which cross-validation technique did you use?

replied to marching_learning · 17 Feb 2025, 08:23

Upvotes 0

nymfree

I see. Apologies for the general statement. You did mention in that leak thread that you were not using such. Curious to know what the 0.0020x sub scored on the private set.

replied to marching_learning · 17 Feb 2025, 08:26

Upvotes 0

marching_learning

Nostalgic Mathematics

I proceeded like the others: stratified k-fold on locations, stratified by whether they had a flood or not. I think the gap may have been caused by one of my features with a distribution shift, or by the use of pseudo-labelling, but I haven't investigated thoroughly yet. I'm a bit gutted. This sub scored 0.0027 on private. It is my only submission (among those <0.0026) with a CV-private gap bigger than 0.0002. But in the end, I knew it was risky.

replied to Koleshjr · 17 Feb 2025, 08:28

Upvotes 1
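
For readers unfamiliar with the scheme mentioned in the post above (stratified k-fold over locations, stratified by whether a location ever flooded), here is a minimal standard-library sketch. The function names, the `(location_id, label)` row layout, and the toy data are all assumptions made for illustration; this is not the poster's actual code, and a library splitter could be used instead.

```python
import random
from collections import defaultdict


def location_stratified_folds(rows, n_splits=5, seed=0):
    """Assign each location to a fold, stratified by whether it ever flooded.

    rows: sequence of (location_id, label) pairs, label 1 = flood, 0 = no flood.
    Returns a dict mapping location_id -> fold index in range(n_splits).
    """
    # Collapse rows to one label per location: 1 if any of its rows is a flood.
    flooded = defaultdict(int)
    for location_id, label in rows:
        flooded[location_id] = max(flooded[location_id], int(label))

    rng = random.Random(seed)
    fold_of = {}
    offset = 0
    # Shuffle each stratum separately, then deal locations round-robin so
    # every fold gets a similar share of flooded and non-flooded locations.
    for stratum in (1, 0):
        members = sorted(loc for loc, f in flooded.items() if f == stratum)
        rng.shuffle(members)
        for i, loc in enumerate(members):
            fold_of[loc] = (i + offset) % n_splits
        offset += len(members)
    return fold_of


def split_rows(rows, fold_of, fold):
    """Return (train_indices, valid_indices); a location never spans both."""
    train_idx, valid_idx = [], []
    for i, (location_id, _) in enumerate(rows):
        (valid_idx if fold_of[location_id] == fold else train_idx).append(i)
    return train_idx, valid_idx


# Toy data (hypothetical): 10 locations, 3 daily rows each;
# locations loc_0..loc_4 have exactly one flood day, the rest have none.
rows = [(f"loc_{k}", int(k < 5 and day == 1)) for k in range(10) for day in range(3)]
fold_of = location_stratified_folds(rows, n_splits=5)
train_idx, valid_idx = split_rows(rows, fold_of, fold=0)
```

Because whole locations are assigned to folds, rows from one location never appear in both the training and validation sides of a split, which is the point of splitting "on locations" rather than on individual rows.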

Koleshjr

Multimedia university of kenya

Yeah, that was a solid CV approach, so I don't think it was a CV issue, since the gap from 0.0020 to 0.0027 is pretty huge. We never tried pseudo-labelling, so I cannot comment on that. But still, getting 0.0020 on the public LB with no post-processing is pretty impressive. Once again, sorry for the bad luck on private.

replied to marching_learning · 17 Feb 2025, 08:43

Upvotes 1
