https://www.kaggle.com/code/danielbruintjies/absa-client-activity-forcasting-challenge/notebook
Solution: an ensemble of 4 multivariate, multistep LSTM models with UserID embeddings; training time ~10 min per model on a Kaggle GPU.
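The general shape of one ensemble member, as a minimal sketch (all dimensions here are made up; the notebook linked above has the real architecture and hyperparameters):

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

# Hypothetical sizes -- placeholders, not the values from the notebook.
N_USERS = 50_000   # UserID vocabulary size
EMB_DIM = 64       # embedding width
N_STEPS_IN = 30    # input history length
N_FEATURES = 12    # multivariate features per timestep
N_STEPS_OUT = 4    # multistep horizon (4 targets per user)

# Sequence branch: per-timestep activity features through an LSTM.
seq_in = layers.Input(shape=(N_STEPS_IN, N_FEATURES), name="activity_seq")
x = layers.LSTM(128)(seq_in)

# Static branch: a learned embedding for each UserID.
user_in = layers.Input(shape=(1,), dtype="int32", name="user_id")
u = layers.Flatten()(layers.Embedding(N_USERS, EMB_DIM)(user_in))

# Concatenate both branches and predict all future steps at once.
h = layers.Dense(64, activation="relu")(layers.Concatenate()([x, u]))
out = layers.Dense(N_STEPS_OUT, activation="sigmoid", name="targets")(h)

model = Model([seq_in, user_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy")
```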
Main learnings:
1. TensorFlow is a very good framework for quickly defining and trying out a cool model architecture, but tf is not deterministic, and results are not reproducible... I should have converted to a PyTorch model after finding the best architecture.
2. I should have taken the time to build a better validation strategy, i.e. a validation set with a target distribution similar to the LB test set (in this case, only 4 targets per user). Because I did not do this, I did not manage to pick a model that would have done well on private (see the sketch after this list).
3. Keep track of all the features you create, and take time to look back at them and make sure the ones you threw away were actually worth throwing away.
4. Start competitions early so there is more time to learn.
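For point 2, a minimal sketch of what I mean, assuming a long-format training frame with made-up column names (UserID, month, target):

```python
import pandas as pd

# Hypothetical long-format training frame -- column names are assumptions.
df = pd.read_csv("train.csv")  # columns: UserID, month, target, ...
df = df.sort_values(["UserID", "month"])

# Mimic the LB test set by holding out exactly the last 4 targets per user,
# so the validation targets are distributed like what actually gets scored.
val = df.groupby("UserID").tail(4)
train = df.drop(val.index)
```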
Congrats to all, and thank you @Zindi and Absa for hosting this cool competition. It was indeed a challenge!
Nice thanks @DanielBruintjies and congrats! You really did well.
I like tf a lot; shouldn't results be reproducible? How about setting the seeds? But it depends on the tf version; I think Kaggle has a relatively old one.
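Something along these lines, maybe (not a guaranteed recipe; which determinism knobs exist depends on the TF version):

```python
import os
import random
import numpy as np
import tensorflow as tf

SEED = 42
random.seed(SEED)
np.random.seed(SEED)

# TF 2.x; in TF 1.x this was tf.set_random_seed(SEED).
tf.random.set_seed(SEED)

# Even with all seeds fixed, many GPU kernels stay nondeterministic.
# Newer TF versions can force deterministic ops, at some speed cost:
os.environ["TF_DETERMINISTIC_OPS"] = "1"
# tf.config.experimental.enable_op_determinism()  # only in recent TF releases
```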
Thanks for sharing - this is awesome.
You have a relatively big model and big embeddings... wow.
You see @wuuthraad, you need a GPU; D's model has 11M+ weights.
This is a nice model, though perhaps a bit big; looking at the graphs, the model starts to overfit quite soon. How was performance when you used a smaller model? Also (wow!!!!) it seems like you added an attention layer?
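Something like this, I would guess (a dot-product self-attention over the LSTM outputs; shapes made up, surely not your exact architecture):

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

seq_in = layers.Input(shape=(30, 12))                # made-up shape
h = layers.LSTM(128, return_sequences=True)(seq_in)  # keep the full sequence

# Dot-product self-attention: every timestep attends to every other
# timestep, then the attended sequence is pooled to a single vector.
att = layers.Attention()([h, h])
pooled = layers.GlobalAveragePooling1D()(att)
out = layers.Dense(4, activation="sigmoid")(pooled)

model = Model(seq_in, out)
```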
Thanks! I spent a lot of time tuning the model, playing with different embedding sizes and layers (LSTM/GRU/CNN/dropout), and anything other than this combo decreased performance on my validation set. I don't have the exact reasoning behind it (I still have a lot to learn about DL theory, like what attention actually is and why certain layers work while others don't), so I can't quite answer your question.
Regarding the seed: after lots of attempts and submissions, I found out pretty late that there were already lots of discussions online about TF on GPU not being reproducible. As for the version, in the latest TF set_random_seed() has been changed to set_seed().
Ok. Yes, I also saw stuff like that (tf and GPU), so I guess I'm lucky I don't have a GPU :-(
I had a (very, very) roughly similar model to yours: embeddings, GRU-based, using tf. What you did is predict 0s and 1s, whereas I predicted actual events; I think yours is a really nice simplification. Also, the way you did the inputs is, I think, very useful, practical, and powerful! Sorry, just thinking out loud. I'm really impressed with your approach. The model itself could perhaps be simpler, but the stuff around it is at such a high level that I can't help but be impressed.
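In sketch form, the simplification as I understand it (column names are my guesses): instead of modelling the raw events, just binarize activity per user and timestep:

```python
import numpy as np
import pandas as pd

# Hypothetical event log -- column names are assumptions.
events = pd.read_csv("events.csv")  # columns: UserID, month, event_type

# Predict "did any activity happen" (0/1) rather than the events themselves:
# pivot to one row per user, one column per month, and binarize the counts.
activity = (
    events.groupby(["UserID", "month"]).size()  # event counts per user-month
          .unstack(fill_value=0)                # users x months matrix
          .gt(0).astype(np.int8)                # counts -> 0/1 indicators
)
```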
Thank you!