I've been experimenting with models that take as input the image of a location and the time series of precipitation data for that event and try to predict, for each day independently, if a flooding event happened that day (so 730 outputs with sigmoid activations to predict a probability for each day).
However, I can't seem to get a score below 0.0032 with this approach. When I visualize my best model's predictions on the validation set, it almost never detects the exact day the flood happens, although it sometimes correctly detects whether there was a flooding event at all in the time series.
I suspect that there are too few examples in the training set to accurately predict the day a flooding happened, which is the entire point of this competition. Would it make more sense to make an assumption like "there is at most one flooding event in a given time series" and adapt the model accordingly?
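To make the idea concrete, here is one way that assumption could be encoded (purely a sketch, not something I've tried on this data): instead of 730 independent sigmoids, predict a single softmax over 731 classes, one per possible flood day plus a "no flood" class. The layer sizes are arbitrary placeholders.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

SERIES_LEN = 730  # days per time series in this competition

# Hypothetical "at most one flood" model: one softmax over
# 731 classes = flood on day d (0..729) or no flood (class 730).
precip_in = layers.Input(shape=(SERIES_LEN, 1), name="precipitation")
x = layers.Bidirectional(layers.LSTM(64))(precip_in)
out = layers.Dense(SERIES_LEN + 1, activation="softmax")(x)

model = keras.Model(precip_in, out)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```

The label for each series would then be the flood day index, or 730 when no flood occurred, so the model can no longer predict several flood days at once.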
Any thoughts and experience shared would be very welcome.
I've been asked for more details on how I use the images in my model. I made a custom neural network with two inputs. One is a convolutional image encoder that maps the (128, 128, 6) image input to an output vector; this vector is then repeated 730 times, once per precipitation value (remember, we only have one image per time series), and concatenated to each value. A Bidirectional LSTM layer (bidirectional because we are looking at the data post-hoc) with a sigmoid output for each day then maps this to a per-day probability, as I said before.
Hope this helps!
Are you using Keras or PyTorch? And do you use any feature engineering? I've replicated the same neural net architecture in TensorFlow but I can't reach 0.0032.
I used Keras, and split the dataset so that the training and validation sets have the same proportion of time series containing a flooding event (47%).
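If it helps, that kind of split can be done with scikit-learn's stratified splitting (a sketch with random placeholder data; `has_flood` stands in for a per-series 0/1 flag that you'd compute from your labels):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 100 series of 730 daily precipitation values,
# and a hypothetical has_flood flag (1 if any day of the series floods)
rng = np.random.default_rng(0)
X_series = rng.random((100, 730, 1))
has_flood = rng.integers(0, 2, size=100)

# stratify=has_flood keeps the flood/no-flood ratio
# (nearly) identical in both splits
X_tr, X_val, y_tr, y_val = train_test_split(
    X_series, has_flood, test_size=0.2, stratify=has_flood, random_state=42
)
```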
By the way, I tried removing the images from the model, training only on the time series, and got pretty much the same results. The images seem to be useless for me.