Is there final code that reproduces the test predictions without the post-processing step? My LB score went from 0.002447672 to 0.002315709 with the boost from https://zindi.africa/competitions/inundata-mapping-floods-in-south-africa/discussions/25014
It is better to validate on the model's raw predictions only; that helps the outcome generalize.
Thanks to https://zindi.africa/users/snow for pointing this out before the competition ended.
Such post-processing is illegal. That's why those who had 0.0020x scores or better are not at the top: they didn't select those dodgy submissions. Looks like you gambled here and now want special rules.
What is your raw prediction score on the LB?
0.0021x. @bit_guber I might have completely misunderstood you - my apologies
Our score on the leaderboard is without the row-order leak, @bit_guber.
I don't know about the others, but our 0.0020x submission didn't use any leak or any post-processing 😊. It just overfitted to the training data and the public LB despite rigorous cross-validation. Sometimes that happens.
So sorry, @marching_learning, you were very strong in this one. Which cross-validation technique did you use?
I see. Apologies for the general statement. You did mention in that leak thread that you were not using such. Curious to know what the 0.0020x sub scored on the private set.
I proceeded like the others: stratified k-fold on locations, based on whether they had a flood or not. I think it may be caused by one of my features having a distribution shift, or by the use of pseudo-labelling, but I haven't investigated thoroughly yet. I'm a bit gutted. This sub scored 0.0027 on private. It is my only submission (among those <0.0026) with a CV-private gap bigger than 0.0002. But in the end, I knew it was risky.
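For anyone reading later, here is a minimal sketch of what "stratified k-fold on locations" could look like. It is not the poster's actual code: the function name `location_stratified_folds`, the toy data, and the rule "a location is flooded if any of its rows is labelled 1" are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def location_stratified_folds(location_ids, flood_labels, n_splits=5, seed=0):
    """Return one fold index per row.

    All rows of a location share a fold, and flooded / non-flooded
    locations are spread as evenly as possible across the folds.
    (Hypothetical helper, not taken from the competition code.)
    """
    # Assumption: a location counts as flooded if any of its rows has label 1.
    flooded = defaultdict(bool)
    for loc, y in zip(location_ids, flood_labels):
        flooded[loc] = flooded[loc] or bool(y)

    rng = random.Random(seed)
    fold_of_location = {}
    for stratum in (True, False):
        locs = sorted(loc for loc, is_flooded in flooded.items() if is_flooded == stratum)
        rng.shuffle(locs)
        # Deal the shuffled locations round-robin so each fold gets a similar share.
        for i, loc in enumerate(locs):
            fold_of_location[loc] = i % n_splits

    return [fold_of_location[loc] for loc in location_ids]


# Toy data: 10 locations with 3 rows each; the first 5 locations flood once.
locations = [f"loc_{i}" for i in range(10) for _ in range(3)]
labels = [1 if (i < 5 and day == 1) else 0 for i in range(10) for day in range(3)]
folds = location_stratified_folds(locations, labels, n_splits=5)
```

Because the split is done per location, rows from the same location never end up on both sides of a train/validation split. If scikit-learn is available, `StratifiedGroupKFold` is a ready-made splitter in the same spirit.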
Yeah, that was a solid CV approach, so I don't think it was a CV issue, since the gap from 0.0020 to 0.0027 is pretty huge. We never tried pseudo-labelling, so I cannot comment on that. But still, getting 0.0020 on the public LB with no post-processing is pretty impressive. Once again, sorry for the bad luck on private.