
Akeed Restaurant Recommendation Challenge

Helping Oman
$3,000 USD
Completed (over 5 years ago)
Prediction
Collaborative Filtering
1420 joined
242 active
Start: May 18, 2020
Close: Aug 16, 2020
Reveal: Aug 16, 2020
Is anyone else getting a large gap between validation and test accuracy?
Help · 9 Jul 2020, 16:35 · 4

I am validating on about 1,400,000 samples and getting an F1 score of 0.2, but when I predict on the test set and submit, my score is much lower, around 0.01. Is anyone else seeing a similar problem?

Discussion 4 answers

Did you use any resampling techniques or target-based features?

I extracted the test 'customer_id' and 'vendor_id' features from the 'CID X LOC_NUM X VENDOR' column of the SampleSubmission file by splitting the strings on the ' X ' delimiter. The train dataset already had 'customer_id' and 'vendor_id' columns. I then concatenated the train set with the test set derived from the SampleSubmission file and factorized the 'customer_id' values using pandas.factorize(). Finally, I split the combined, factorized dataset back into train and test sets and used sklearn's train_test_split to carve a validation subset out of the training data.
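A minimal sketch of the steps described above, assuming the column layout of the SampleSubmission file (the sample values and the train frame here are made up for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the SampleSubmission file; the real file
# has a combined 'CID X LOC_NUM X VENDOR' key column in this format.
sample_sub = pd.DataFrame({
    "CID X LOC_NUM X VENDOR": ["A7B8 X 0 X 105", "A7B8 X 1 X 294"]
})

# Split the combined key on the ' X ' delimiter into its three parts.
parts = sample_sub["CID X LOC_NUM X VENDOR"].str.split(" X ", expand=True)
test = pd.DataFrame({
    "customer_id": parts[0],
    "location_number": parts[1].astype(int),
    "vendor_id": parts[2].astype(int),
})

# Toy train frame; the real one already has these columns.
train = pd.DataFrame({"customer_id": ["A7B8", "C9D1"], "vendor_id": [105, 88]})

# Factorize customer_id over train and test together so both sets
# share one consistent integer encoding.
combined = pd.concat([train["customer_id"], test["customer_id"]],
                     ignore_index=True)
codes, uniques = pd.factorize(combined)
train["customer_code"] = codes[: len(train)]
test["customer_code"] = codes[len(train):]
```

Factorizing over the concatenated series (rather than train and test separately) is what keeps the same customer mapped to the same integer in both sets.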

I have the same problem

How did you split train and validation?

I think the proper way to do it is to split on customers (i.e. before merging with locations and before merging with vendors).
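Splitting on customers can be sketched with scikit-learn's GroupShuffleSplit, which keeps all rows of a given customer on one side of the split (the toy frame below stands in for the expanded train set):

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Toy frame: after merging with locations and vendors, each
# customer appears on many rows.
df = pd.DataFrame({
    "customer_id": ["a", "a", "b", "b", "c", "c", "d", "d"],
    "vendor_id":   [1, 2, 1, 3, 2, 3, 1, 2],
    "target":      [0, 1, 0, 0, 1, 0, 0, 1],
})

# Split on customers, not rows: every row of a given customer lands
# entirely in train or entirely in validation.
gss = GroupShuffleSplit(n_splits=1, test_size=0.25, random_state=0)
train_idx, val_idx = next(gss.split(df, groups=df["customer_id"]))
train_df, val_df = df.iloc[train_idx], df.iloc[val_idx]

# No customer leaks across the split.
assert set(train_df["customer_id"]).isdisjoint(set(val_df["customer_id"]))
```

A plain row-wise train_test_split would put rows of the same customer on both sides, which inflates the validation score relative to the leaderboard, where test customers' rows are never seen in training.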

That might be the reason.

10 Jul 2020, 07:51
Upvotes 0