So far Catboost.
CatBoost and feed-forward neural networks get you into the 0.027-0.028 range; at least, that was the case for me. Eventually CatBoost trumped the NN. Make sure to run any of the NN models on Google Colab to use its GPUs; otherwise training is very taxing.
You can also use a Kaggle private notebook with a GPU; it works well, and the competition data is available there.
LightGBM ended up working the best for me, but it required a lot more tweaking than Catboost did.
Can you share your code? I would like to see your approach and learn from it.
A 5-fold LightGBM CV gave me the best result.
LightGBM in dart mode had the best results according to cross-validation, but CatBoost had the best results on both the public and private leaderboards. I should have put more weight on CatBoost in my ensemble to score higher. At first I thought my CatBoost model was overfitting to the public leaderboard, but that wasn't the case.
Could you please share your params? For me, CatBoost and XGBoost perform worse.
@lcfstat, just asking: how do you get to see how your model is performing on the private leaderboard?
@ssshch {'boosting_type': 'dart', 'max_depth': -1, 'num_leaves': 32, 'learning_rate': 0.1, 'min_child_samples': 20, 'feature_fraction': 0.8, 'bagging_freq': 1, 'bagging_fraction': 0.9, 'lambda_l1': 1, 'lambda_l2': 1}
However, I think you should be careful with the number of iterations, because dart mode does not support early stopping.
@elijah-a-w Once the competition has finished, all participants can see their private leaderboard scores in the submissions menu.
My approach was a sum of LightGBM multiclass and binary models.
One-hot encoding of categorical features, TF-IDF, and tuning lambda_l2 increased my score (roughly 40 places on both the public and private leaderboards).
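The one-hot plus TF-IDF preprocessing mentioned above can be wired up in a few lines with scikit-learn; the column names below are hypothetical placeholders for whatever the competition data actually contains:

```python
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import OneHotEncoder

def make_features(df, cat_cols, text_col):
    """One-hot encode categorical columns and TF-IDF a single text column."""
    ct = ColumnTransformer([
        ("cat", OneHotEncoder(handle_unknown="ignore"), cat_cols),
        # TfidfVectorizer expects a single column name (a string), not a list.
        ("txt", TfidfVectorizer(), text_col),
    ])
    return ct.fit_transform(df), ct
```

The fitted transformer is returned so the same vocabulary and categories can be applied to the test set with `ct.transform`.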