Alvin Smart Money Management Classification Challenge
Can you classify purchases recorded on Alvin into different categories?
$3 000 USD
Ended 6 months ago
220 active · 455 enrolled
Financial Services
Congratulations to the winners
Platform · 25 Jul 2022, 06:20 · 6

Just trust your private LB. When I saw the public LB score, I was so determined that I ignored my private LB. I wish I had relied on my CV. I'm excited to see what the top five did to get their scores. Congratulations to the winners.

Discussion 6 answers

Congratulations to the winners. I'd love to hear how those with top scores handled their cross-validation and avoided overfitting, as well as which feature engineering techniques they used.

25 Jul 2022, 10:26

Congratulations to all, and thanks to my teammate @Koleshjr. It was a really amazing competition and we learned a lot from it. Thanks to @zindi for hosting it.

Our approach was:

  • Deal with missing values
  • Drop unnecessary columns
  • Drop outliers (rows with the same data but different categories)
  • Count encoding & new features
  • Merchant clustering into 10 categories
  • Aggregation and feature combination
  • Automatic clustering using KMeans on the CountVectorizer output
  • Decomposition of features
  • Oversampling using SMOTE
  • StratifiedKFold with 5 splits
  • Voting classification and model stacking
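
For readers who want something concrete, here is a minimal sketch of three of the steps above: dropping rows with identical features but different categories, count encoding, and a soft-voting ensemble scored with StratifiedKFold. It is not our notebook. The toy data, the column names (merchant, amount, category), the helper functions and the choice of base models are all assumptions made for illustration. SMOTE is left out because it lives in the separate imbalanced-learn package; if you use it, it would normally be applied to the training folds only.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def drop_conflicting_rows(df, feature_cols, target_col):
    """Drop rows whose features are identical but whose labels disagree."""
    n_labels = df.groupby(feature_cols)[target_col].transform("nunique")
    return df[n_labels == 1].reset_index(drop=True)


def count_encode(series):
    """Replace each category with the number of times it occurs."""
    return series.map(series.value_counts())


# Toy data: hypothetical columns, not the competition schema.
rows = [
    ("uber", 300, "transport"), ("uber", 450, "transport"),
    ("bolt", 250, "transport"), ("shell", 2000, "transport"),
    ("kfc", 800, "food"), ("java", 650, "food"), ("naivas", 1500, "food"),
] * 5
# Two rows with the same features but different categories (the "outliers").
rows += [("kfc", 900, "food"), ("kfc", 900, "transport")]
raw = pd.DataFrame(rows, columns=["merchant", "amount", "category"])

clean = drop_conflicting_rows(raw, ["merchant", "amount"], "category")
clean["merchant_count"] = count_encode(clean["merchant"])

X = clean[["amount", "merchant_count"]]
y = clean["category"]

# Soft voting averages the predicted class probabilities of the base models.
model = VotingClassifier(
    estimators=[
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    voting="soft",
)

# Stratified folds keep the class proportions roughly equal in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv)
print(len(raw), "->", len(clean), "rows after dropping conflicts")
print("fold accuracies:", scores)
```

In a real run the count encoding would be fitted on the training part of each fold, so the validation rows do not influence their own features.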

Did you decompose all features, or only the outputs from CountVectorizer? If you don't mind, please share the method as well.

No, we decomposed the aggregate features that were formed from MERCHANT_NAME. The outputs of CountVectorizer were clustered into groups using KMeans clustering.
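
To illustrate the clustering part, here is a rough sketch of grouping merchant names with CountVectorizer followed by KMeans. The merchant strings, the default word-level tokenisation and the choice of 3 clusters are assumptions picked to keep the toy example small; the decomposition step is not shown because the method is not stated in this thread.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical merchant names standing in for the MERCHANT_NAME column.
merchant_names = [
    "UBER TRIP NAIROBI",
    "UBER BV",
    "BOLT RIDE",
    "KFC WESTLANDS",
    "KFC WESTLANDS",
    "KFC JUNCTION",
    "JAVA HOUSE",
    "SHELL PETROL STATION",
    "TOTAL PETROL STATION",
]

# Bag-of-words counts: one row per merchant, one column per token.
vectorizer = CountVectorizer()
token_counts = vectorizer.fit_transform(merchant_names)

# Cluster the sparse count vectors; the cluster id becomes a new feature.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
merchant_cluster = kmeans.fit_predict(token_counts)

for name, cluster in zip(merchant_names, merchant_cluster):
    print(cluster, name)
```

The resulting cluster id can be added as a categorical column next to the other features. With real data the number of clusters would need tuning.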

Can you share your solution please 🙏?

Sorry, but Zindi is still validating our solution. Until that is done, we can't share the notebook.