
iX Mobile Banking Prediction Challenge

Helping Africa
$3 495 USD
Challenge completed over 4 years ago
Prediction
340 joined
103 active
Start: May 17, 21
Close: Jun 13, 21
Reveal: Jun 13, 21
underfitting
Church of christ
How to validate models and select submissions in future
Help · 14 Jun 2021, 04:13 · 14

Hi everyone. I am surprised to see that there was a big difference between the scores on the public leaderboard and those on the private leaderboard. I thought that my best submission on the public leaderboard would also perform best on the private one, but contrary to that, it performed worst. Several of my submissions performed much better on the private leaderboard than both the submission that was best on the public leaderboard and the one that was best on my local cross-validation, but I failed to select those submissions.

I would like anyone to help me with tips and tricks to evaluate models locally and how to select submissions which will perform best on the private leaderboard.

You can comment below if you would like to share your experience on cross validation, public leaderboard and private leaderboard.
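To make the question concrete, this is the kind of local validation I have been doing; a minimal sketch with synthetic data standing in for the competition files (I'm assuming the leaderboard metric is log loss, so adjust the scoring if not):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the competition data (imbalanced binary target).
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.725], random_state=42)

# Stratified folds keep the class ratio stable in every split, so the
# fold scores are comparable to each other and to the leaderboard metric.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="neg_log_loss")
print(f"CV log loss: {-scores.mean():.4f} +/- {scores.std():.4f}")
```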

Discussion · 14 answers

I would also like this. Kindly share the information.

14 Jun 2021, 04:26
Upvotes 0
ASSAZZIN

Hello,

This competition is not a good example for submission selection. In fact, I created this EDA notebook to show that there is something weird.

  • All regions have the same target distribution! [0 ≈ 72.5% | 1 ≈ 27.5%]
  • All regions have the same number of countries? By the way, there are 195 countries in the whole world, lol! (A quick check for both is sketched below.)
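If you want to reproduce the check yourself, it only takes a couple of groupbys; here is a sketch with a synthetic frame standing in for the real Train.csv (the actual column names may differ):

```python
import pandas as pd

# Synthetic stand-in; swap in the real Train.csv and column names.
train = pd.DataFrame({
    "region":  ["North", "North", "South", "South"] * 50,
    "country": ["GH", "KE", "NG", "ZA"] * 50,
    "target":  [0, 1, 0, 1] * 50,
})

# Target rate per region: in the competition data every region came out
# at roughly the same 72.5% / 27.5% split.
print(train.groupby("region")["target"].value_counts(normalize=True).unstack())

# Distinct countries per region: oddly identical across regions.
print(train.groupby("region")["country"].nunique())
```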

As I said, this competition is not a good example to show the strength of submission selection. For example, I selected two submissions, one with a 0.5179 CV score and another with a 0.5163 CV score!

But I was surprised to see that they didn't perform well on the private leaderboard, and that I had a winning submission at 0.5084 with a 0.5112 CV score.

---------------------------------------------------------------------------

Anyway, maybe it was a chance for a lot of competitors to try some brilliant modelling and processing ideas.

Even if it didn't work out for you, don't panic; you'll practice what you've learnt in other good classification competitions.

---------------------------------------------------------------------------

Finally, I posted an unofficial winning solution code in this discussion that shows how to easily reach 1st place in this competition.

MICADEE
LAHASCOM

@ASSAZZIN In fact, I totally agree with everything you've said. This competition is not a good example for submission selection. I was laughing at myself when I discovered that my very best score came from a submission made about 25 days ago: just a single model (LGBM) with a 0.4984 public score and a 0.5040 private LB score. The funniest thing is that I kept ensembling my models and it kept increasing my LB score. Lol...

Lol... What a challenge! But nevertheless, the best approach is to practice what you've learnt in other "good classification competitions", like you said.

Well, surprisingly for me, complex models never performed well, even on the private LB. My best model was just a Naive Bayes model. I discovered that an ensemble of Naive Bayes variants had the best score (~0.503), although I didn't select it.

underfitting
Church of christ

Thank you very much @ASSAZZIN for sharing your experience and your solutions. I have been a victim of overfitting the public leaderboard.

I had a submission with a CV score of 0.501, a public leaderboard score of 0.49, and a private score of 0.505. I don't understand why the scores varied so much. Maybe the train and test datasets had different distributions, or the public and private leaderboard splits had different distributions. I am not sure; I did not do proper EDA to check whether this was the case. I had given up after seeing that my scores were not improving on the public leaderboard.

Next time, I will not give up as early as I did.
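One standard way to check the train/test distribution question is adversarial validation: label the train rows 0 and the test rows 1, and see whether a classifier can tell them apart. An AUC near 0.5 means the two sets look alike; a high AUC means their distributions differ. A sketch with synthetic frames standing in for the real files:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-ins; replace with the real train/test feature frames
# (drop IDs and the target first, and encode any categorical columns).
train_X = pd.DataFrame(rng.normal(0.0, 1.0, size=(500, 5)))
test_X = pd.DataFrame(rng.normal(0.2, 1.0, size=(500, 5)))  # mild shift

# Label each row by its origin and see if a model can tell them apart.
X = pd.concat([train_X, test_X], ignore_index=True)
y = np.r_[np.zeros(len(train_X)), np.ones(len(test_X))]

auc = cross_val_score(RandomForestClassifier(n_estimators=200, random_state=0),
                      X, y, cv=5, scoring="roc_auc").mean()
print(f"Adversarial AUC: {auc:.3f}")  # ~0.5 => train and test look alike
```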

underfitting
Church of christ

I have been using CatBoost in many competitions because it usually performs well without tuning hyperparameters.
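For anyone who hasn't tried it, a baseline really is only a few lines; a sketch on a tiny synthetic frame (CatBoost handles the categorical column natively, so no encoding is needed):

```python
import pandas as pd
from catboost import CatBoostClassifier

# Tiny synthetic frame standing in for the competition data.
X = pd.DataFrame({"region": ["A", "B", "A", "C"] * 25,
                  "age":    [20, 35, 41, 57] * 25})
y = [0, 1, 0, 1] * 25

# Default hyperparameters; just point it at the categorical column.
model = CatBoostClassifier(iterations=100, verbose=0, random_seed=42)
model.fit(X, y, cat_features=["region"])
print(model.predict_proba(X)[:5])
```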

I agree that there are other competitions in which the public leaderboard and CV scores do reflect the scores on the private leaderboard. In this competition, though, it was difficult to identify which submission would perform best on the private leaderboard.

There are times when a competitor can choose to completely ignore the public leaderboard.

underfitting
Church of christ

This sounds interesting @Gozie. How did you go about ensembling?

By averaging the predictions of the Naive Bayes variant models. The variants (from scikit-learn) were the multinomial, categorical, and complement Naive Bayes models.
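A minimal sketch of that kind of averaging, with synthetic non-negative integer features standing in for the real ones (the multinomial and categorical variants require non-negative inputs):

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB, ComplementNB, MultinomialNB

rng = np.random.default_rng(42)
# Synthetic non-negative integer features, as these NB variants expect.
X = rng.integers(0, 5, size=(300, 8))
y = rng.integers(0, 2, size=300)

models = [MultinomialNB(), CategoricalNB(), ComplementNB()]
for m in models:
    m.fit(X, y)

# Simple average of the three variants' positive-class probabilities.
avg_proba = np.mean([m.predict_proba(X)[:, 1] for m in models], axis=0)
print(avg_proba[:5])
```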

MICADEE
LAHASCOM

Well, yes, it happens at times, especially with this kind of competition. Naive Bayes!!! Awesome!!! 👍

MICADEE
LAHASCOM

Yes, you're right. I think it's more about the kind of dataset provided. If an EDA notebook reveals a lot of weird things, then it will be very difficult to know whether your model performs better or not. So it is more about the dataset provided, not so much about the model implemented.

underfitting
Church of christ

That is brilliant @Gozie. I had never thought that Naive Bayes would perform well in a competition.

underfitting
Church of christ

I agree with you @MICADEE.

I moved from #96 to #18. I am totally surprised. I actually gave up on the competition because my model was not improving despite trying different things.

But from what I know, almost all competitions let you select two final submissions.

For one slot, pick the submission that performs best on the public leaderboard; for the other, pick the one that performs best on your local validation.
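Keeping a log of every submission makes this mechanical; a sketch with hypothetical scores (log loss, so lower is better):

```python
import pandas as pd

# Hypothetical submission log kept during the competition.
log = pd.DataFrame({
    "file":   ["sub1.csv", "sub2.csv", "sub3.csv"],
    "cv":     [0.5112, 0.5163, 0.5179],   # local CV log loss
    "public": [0.4990, 0.4984, 0.5010],   # public leaderboard log loss
})

# Slot 1: best public leaderboard score. Slot 2: best local CV score.
picks = {log.loc[log["public"].idxmin(), "file"],
         log.loc[log["cv"].idxmin(), "file"]}
print(picks)  # {'sub2.csv', 'sub1.csv'}
```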

14 Jun 2021, 09:23
Upvotes 0
underfitting
Church of christ

Thank you @SanJose. You have shared a good way of selecting submissions.

I did the challenge for an hour or two, then I gave up after realizing that my scores were not improving.