Alvin Smart Money Management Classification Challenge
Can you classify purchases recorded on Alvin into different categories?
$3 000 USD
Financial Services
Log loss issue
Help · 7 Jul 2022, 20:44 · 7

Does anyone know how to solve this issue, please? Thank you!

ValueError: y_true and y_pred contain different number of classes 8, 11. Please provide the true labels explicitly through the labels argument. Classes found in y_true: ['Bills & Fees' 'Data & WiFi' 'Going out' 'Groceries' 'Health' 'Loan Repayment' 'Shopping' 'Transport & Fuel']

Discussion 7 answers

The problem is with your train and validation sets.

7 Jul 2022, 20:51
Upvotes 1

Okay! What specifically is wrong with it, please, @Raheem_Nasirudeen? I can't seem to understand the issue.

Use model.predict_proba(test) instead of the normal predict function.
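For reference, here is a minimal sketch of how predict_proba output can be scored with log_loss when the validation labels cover fewer classes than the model was trained on. The toy data, the class names, and the choice of LogisticRegression are assumptions for illustration only, not the original poster's setup. The idea is to pass the model's full class list through the labels argument, as the error message itself suggests.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss

rng = np.random.default_rng(0)

# Hypothetical toy data: the model sees three classes during training...
X_train = rng.normal(size=(60, 4))
y_train = np.array(["Groceries", "Health", "Rent"] * 20)

# ...but the validation labels happen to contain only two of them.
X_val = rng.normal(size=(10, 4))
y_val = np.array(["Groceries", "Health"] * 5)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# One probability column per class in model.classes_ (3 columns here).
proba = model.predict_proba(X_val)

# Without labels, log_loss infers the classes from y_val (2 classes),
# which does not match the 3 probability columns and raises ValueError.
try:
    log_loss(y_val, proba)
except ValueError as err:
    print("Without labels:", err)

# Passing the full class list makes the column-to-class mapping explicit.
val_loss = log_loss(y_val, proba, labels=model.classes_)
print("Validation log loss:", val_loss)
```

This only silences the mismatch; it does not put the missing classes back into the validation set, so the score still ignores them.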

7 Jul 2022, 20:57
Upvotes 0

That is what I am using: predict_proba.

I think that because the number of samples in some classes (education, for example) is very low, your validation set ends up with no samples for some labels. As far as I can see, your "y_true" contains no 'education' or 'rent/mortgage' at all. Use the "stratify" option in your train/test split.
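A rough sketch of what a stratified split looks like with scikit-learn's train_test_split follows. The labels, the class proportions, and the 80/20 split are made up for illustration. Note that stratify needs at least two samples in every class, otherwise train_test_split raises an error.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical imbalanced labels: one dominant class and two rare ones.
y = np.array(["Groceries"] * 80 + ["Transport & Fuel"] * 15 + ["Education"] * 5)
X = np.arange(len(y)).reshape(-1, 1)  # placeholder features

# stratify=y keeps the class proportions of y in both splits,
# so the rare classes are represented in the validation set too.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

print(sorted(set(y_train)))
print(sorted(set(y_val)))
```

With a plain random split, a class with only a handful of rows can easily end up entirely in the training part, which is what produces the class-count mismatch in log_loss.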

But it is still strange that you have 11 classes in your train set; there should be 13. Check your label encoding as well.
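One quick way to check this is to count the distinct labels in the full data and list which ones are missing from each split. The sketch below uses only the standard library; class_report is a hypothetical helper name and the labels are toy values, not the competition data.

```python
from collections import Counter


def class_report(y_full, y_train, y_val):
    """Count classes in the full data and list those absent from each split."""
    full = Counter(y_full)
    return {
        "n_classes_full": len(full),
        "counts": dict(full),
        "missing_from_train": sorted(set(full) - set(y_train)),
        "missing_from_val": sorted(set(full) - set(y_val)),
    }


# Hypothetical toy labels, split naively by position (no shuffling, no stratify).
y_full = ["Groceries"] * 6 + ["Health"] * 3 + ["Education"] * 1
y_train = y_full[:8]
y_val = y_full[8:]

report = class_report(y_full, y_train, y_val)
print(report["n_classes_full"])
print(report["missing_from_train"])
print(report["missing_from_val"])
```

If the number of classes in the full data is already lower than expected, the labels were probably lost or merged before the split, for example during encoding.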

8 Jul 2022, 12:11
Upvotes 1

Yeah, we have an imbalanced class distribution here.

Thank you, @serg132003. I will try implementing stratify.