Hello everyone, I hope you're enjoying this challenge. I've been scratching my head with both neural nets and boosting models, and I still can't break below 0.003. I'd appreciate it if you could share some tips.
Same here, I struggled to break below this range too, regardless of the approach.
I'm here for the tips too 👍
At least you got 0.003, mine is a disaster
Looks like boosting models are the successful approach here
I've got 0.0027 on the leaderboard with a LightGBM model, averaging the predictions of a 10-fold CV and stratifying the time series according to whether or not they contain a flood
For each day I simply used as features the day number (0-729), that day's precipitation value, and all the other days' precipitation values (729 lags)
I'm sure with some parameter tuning and better featurization the score can improve
I'm curious whether any numeric feature calculated from the images can help; in all my experiments the images were of no use
Thank you for sharing. So for a given day, say day t, you are using lags from day t-1, day t-2, ..., back to day 1?
I also wrap around: for day t I use the previous t-1 precipitation values and the following 730 - t ones, as well as the precipitation of day t itself, obviously
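For anyone confused by the wrap-around, here's a small sketch with made-up precipitation values. `np.roll` does the cycling, so row t starts with the day number, then day t's own value, then the other 729 values in wrapped order:

```python
import numpy as np

# One series of daily precipitation (stand-in values; real data differs).
n_days = 730
precip = np.arange(n_days, dtype=float)

# Row t = features for day t: day number, then the 730 precipitation values
# in wrapped order starting at day t (so t's own value, then t+1, ..., t-1).
rows = []
for t in range(n_days):
    wrapped = np.roll(precip, -t)          # wrapped[0] is day t itself
    rows.append(np.concatenate(([t], wrapped)))
X = np.array(rows)                         # shape (730, 731)
```

So every row sees all 730 days, just rotated so the "current" day is always in the same column position.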
Very interesting. Mine also works very well with LightGBM; in general I see classifiers work better than regressors. So far I'm still trying to combine LightGBM and MLPClassifier, since both models seem to work quite well.
Sounds great. Are you applying anything to the data? Any tips?
I'm using boosting models and I've tried adding lags as you said. I've tried winsorization, removing outliers, GroupKFold, StratifiedKFold, and extensive feature engineering, and still no significant boost. Am I missing something here?
Sorry to hear that, but the basic setup is really simple: just create lags for each day of each time series (I used cyclic wrap-around, but padding yields the same results) and binary-classify each day. No particular preprocessing, since tree models are not sensitive to data scale.
As for the split: for each time series ID, assign 1 if it contains a flood and 0 otherwise, then split the dataset so that training and validation have the same percentage of flood-containing series. Nothing else
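Here's a rough sketch of that split with toy data (the array names and where I place the flood are made up). The point is that you stratify at the *series* level, then expand back to per-day rows, so whole series stay together and both sides keep the same flood ratio:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# One row per (series, day); y marks flood days (toy placement, real data differs).
n_series, n_days = 12, 730
series_id = np.repeat(np.arange(n_series), n_days)
y = np.zeros(n_series * n_days, dtype=int)
# Toy floods: even-numbered series flood on their day 100.
y[(series_id % 2 == 0) & (np.arange(n_series * n_days) % n_days == 100)] = 1

# Series-level label: 1 if the series contains any flood day.
has_flood = np.array([y[series_id == s].max() for s in range(n_series)])

# Split series IDs stratified on the flood label, then expand back to rows.
train_series, val_series = train_test_split(
    np.arange(n_series), test_size=0.2, stratify=has_flood, random_state=0)
train_rows = np.isin(series_id, train_series)
val_rows = np.isin(series_id, val_series)
```

Since the split happens on series IDs, no series ever ends up in both sides, which is what a plain row-level StratifiedKFold would get wrong here.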
Worked like magic! I'm really grateful.