The sample_submission dataframe is built from a dictionary that maps user_id as a key to the corresponding column in the test dataframe. My question is: are we using only the test data there, or the training data as well?
Should it look like this?
pd.DataFrame({'user_id': test['user_id']})
Yes, you use the test data for your final submission. The training data is only used to fit the model.
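To make this concrete, here is a minimal sketch of how a submission might be put together. The toy `train` and `test` dataframes, the `feature` and `target` column names, and the mean-based "model" are all assumptions for illustration; the actual competition files and prediction column will likely differ.

```python
import pandas as pd

# Hypothetical stand-ins for the competition's train and test files.
train = pd.DataFrame({
    "user_id": [1, 2, 3],
    "feature": [0.1, 0.4, 0.9],
    "target": [0, 1, 1],
})
test = pd.DataFrame({
    "user_id": [101, 102],
    "feature": [0.3, 0.8],
})

# Placeholder "model": predict the training-set mean target for every test row.
# The training data is used only here, to produce the predictions.
predictions = [train["target"].mean()] * len(test)

# The submission takes its ids from the test data, one row per test user.
submission = pd.DataFrame({
    "user_id": test["user_id"],
    "target": predictions,
})
```

Calling `submission.to_csv("submission.csv", index=False)` afterwards would write the file without the extra index column.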