Financial Inclusion in Africa
Knowledge
Predict who is most likely to have a bank account
1729 data scientists enrolled, 788 on the leaderboard
Financial Services, Prediction, Structured
29 July 2019
Bank Account Classification | Sharing Experience | Four Days
published 16 Sep 2019, 16:14

Hi all,

Since this challenge can be treated as a learning challenge, I want to share my experience with it. I put a lot of effort into this project, which lasted about 4 days. I remember submitting my first result about 6 hours after joining the challenge. At first sight I thought it would be easy to handle, but then I just got stuck at the same rank!!

Looking at the leaderboard, it's obvious that there's room to improve the model I built. So I started examining the dataset closely and applying a bunch of statistics and feature selection methods to find the most relevant features. Finally, I retrained my model after choosing suitable hyperparameters, but I noticed that the validation accuracy plateaus at around 88.5 ~ 88.7 %. To me it was just like a black box, for several reasons that I'll list:

1- I found out that the validation loss stalling isn't related to the model I used. It's not even related to the training method/optimizer. (I observed that by running many test cases.)

2- The features I used were picked by applying many methods, as I mentioned before. In particular, by eliminating unrelated features and emphasizing the important ones.

3- Across different techniques, optimizers, and activation functions, the model always plateaus at the same accuracy I mentioned above. Not only that, but different optimizers/techniques also give exactly the same predictions for the test-dataset.csv, which is kind of weird because different optimizers and different initial starting points aren't likely to produce the same predictions.
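One way to double-check point 3 is to compare the runs directly. The sketch below is only an illustration with made-up helper names (agreement_rate, majority_share, max_probability_gap are not from my actual code): it measures how often two sets of predicted labels agree, how much of a prediction set falls into a single class, and how far apart the raw probabilities are. If the labels match exactly but the probabilities differ, the runs are not really identical; they just land on the same side of the decision threshold.

```python
from collections import Counter


def agreement_rate(preds_a, preds_b):
    """Fraction of positions where two prediction lists give the same label."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        raise ValueError("prediction lists must not be empty")
    same = sum(a == b for a, b in zip(preds_a, preds_b))
    return same / len(preds_a)


def majority_share(labels):
    """Return the most common label and the fraction of items that carry it."""
    if not labels:
        raise ValueError("labels must not be empty")
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)


def max_probability_gap(probs_a, probs_b):
    """Largest absolute difference between two lists of predicted probabilities."""
    if len(probs_a) != len(probs_b):
        raise ValueError("probability lists must have the same length")
    return max(abs(a - b) for a, b in zip(probs_a, probs_b))


if __name__ == "__main__":
    # Toy numbers for illustration only, not real competition output.
    run_sgd = [0, 1, 0, 0]
    run_adam = [0, 1, 0, 0]
    print(agreement_rate(run_sgd, run_adam))
    print(majority_share(run_sgd))
    print(max_probability_gap([0.10, 0.80], [0.20, 0.80]))
```

It may also be worth comparing the majority share of the validation labels with the plateau accuracy: if the two numbers are close, the model could be doing little more than predicting the most common class.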

Afterwards, I concluded that the only way to open this Pandora's box is to examine the validation examples that were classified incorrectly. At this conclusion, I just stopped :D.
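For anyone who wants to pick up from there, here is a minimal sketch of what that examination could look like. It's not something I ran on the real data: the function names are made up, and the "location_type" column in the toy rows is just a hypothetical example feature. The idea is to count the error types and then see whether the mistakes pile up in particular groups.

```python
from collections import defaultdict


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary 0/1 labels."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


def error_rate_by_group(rows, y_true, y_pred, key):
    """Misclassification rate for each distinct value of rows[i][key]."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for row, truth, pred in zip(rows, y_true, y_pred):
        group = row[key]
        totals[group] += 1
        if truth != pred:
            errors[group] += 1
    return {group: errors[group] / totals[group] for group in totals}


if __name__ == "__main__":
    # Toy validation rows for illustration only; "location_type" is a
    # hypothetical feature name.
    rows = [
        {"location_type": "Rural"},
        {"location_type": "Rural"},
        {"location_type": "Urban"},
        {"location_type": "Urban"},
    ]
    y_true = [0, 1, 1, 0]
    y_pred = [0, 0, 1, 0]
    print(confusion_counts(y_true, y_pred))
    print(error_rate_by_group(rows, y_true, y_pred, "location_type"))
```

If most of the errors turn out to be false negatives, or they concentrate in one or two groups, that would point at the data (class imbalance or a missing feature) rather than at the model or the optimizer.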

By the way, I'm sharing this to hear others' reviews and comments on their experience with this project.