Alvin Smart Money Management Classification Challenge

Helping Kenya
$3 000 USD
Challenge completed ~3 years ago
Classification
497 joined
220 active
Start
Jun 22, 22
Close
Jul 24, 22
Reveal
Jul 24, 22
msamwelmollel
University of Glasgow
Congratulations to the winners
Platform · 25 Jul 2022, 06:20 · 6

Just trust your private LB. When I saw the public LB score, I was so determined that I ignored my private LB. I wish I had relied on my CV. I'm excited to see what the top five did to get their scores. Congratulations to the winners.

Discussion 6 answers

Congratulations to the winners. I'd love to hear how those with top scores handled their cross-validation and avoided overfitting, as well as which feature engineering techniques they used.

25 Jul 2022, 10:26
Upvotes 1
J0NNY
Adama science and technology university

Congratulations to all, and thanks to my teammate @Koleshjr. It was a really amazing competition and we learned a lot from it. Thanks to @zindi for hosting it.

Our approach:

  • Deal with missing values
  • Drop unnecessary columns
  • Drop outliers (rows with the same data but different categories)
  • Count encoding & new features
  • MERCHANT CLUSTERING to 10 categories
  • Aggregation and feature combination
  • Automatic clustering using KMeans on CountVectorizer output
  • Decomposition of features
  • Oversampling using SMOTE
  • StratifiedKFold with 5 splits
  • Voting classification and model stacking
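
For readers who want to see the validation part of this list in code, here is a rough sketch of stratified 5-fold cross-validation around a soft-voting ensemble. It is not the team's notebook: the synthetic data, the two base models, and the log loss metric are all assumptions chosen for illustration. SMOTE (from imbalanced-learn) is left out to keep the example to scikit-learn only; if it were added, it would normally be fit on the training fold only so that the validation fold stays untouched.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss
from sklearn.model_selection import StratifiedKFold

# Synthetic, imbalanced 3-class data standing in for the real features (assumption).
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5,
    n_classes=3, weights=[0.6, 0.3, 0.1], random_state=0,
)


def make_model():
    """Build a fresh soft-voting ensemble; the base models are illustrative only."""
    return VotingClassifier(
        estimators=[
            ("lr", LogisticRegression(max_iter=1000)),
            ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ],
        voting="soft",  # average predicted probabilities across models
    )


# Stratification keeps the class proportions roughly equal in every fold.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
fold_scores = []
for train_idx, valid_idx in skf.split(X, y):
    model = make_model()
    model.fit(X[train_idx], y[train_idx])
    proba = model.predict_proba(X[valid_idx])
    fold_scores.append(log_loss(y[valid_idx], proba, labels=model.classes_))

print(f"CV log loss: {np.mean(fold_scores):.4f} +/- {np.std(fold_scores):.4f}")
```

Watching the spread across folds, rather than a single public leaderboard number, is one way to judge whether a change is a real improvement.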

Did you decompose all features, or only the outputs from CountVectorizer? If you don't mind, please share the method as well.

J0NNY
Adama science and technology university

No, we decomposed the aggregate features that were formed from MERCHANT_NAME. The outputs of CountVectorizer were clustered into groups using KMeans.
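
A minimal sketch of that clustering step, assuming scikit-learn, could look like the following. The merchant names are made up, the character n-gram settings are a guess rather than the team's configuration, and the helper name `cluster_merchants` is hypothetical. The approach list mentions 10 categories; the toy example uses 3 because it only has nine names.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer


def cluster_merchants(names, n_clusters=10, seed=0):
    """Return one integer cluster label per merchant name (hypothetical helper)."""
    # Character n-grams tolerate abbreviations and spelling variants.
    # This analyzer choice is an assumption, not the team's setting.
    vectorizer = CountVectorizer(analyzer="char_wb", ngram_range=(2, 3))
    counts = vectorizer.fit_transform(names)  # sparse name-by-ngram count matrix
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(counts)


# Made-up merchant names for illustration only.
merchants = [
    "NAIVAS SUPERMARKET", "NAIVAS SUPERMKT", "QUICKMART SUPERMARKET",
    "SHELL PETROL STATION", "TOTAL PETROL STATION", "RUBIS PETROL",
    "KENYA POWER", "KPLC PREPAID", "KENYA POWER TOKENS",
]
labels = cluster_merchants(merchants, n_clusters=3)
for name, label in zip(merchants, labels):
    print(label, name)
```

The resulting label can then be used as a categorical feature in place of the raw, high-cardinality merchant name.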

Can you share your solution please 🙏?

J0NNY
Adama science and technology university

Sorry, but Zindi is still validating our solution. Until then, we can't share the notebook.