Gender-Based Violence Tweet Classification Challenge

Helping Global
2000 Points
Challenge completed almost 4 years ago
Natural Language Processing
Classification
634 joined
140 active
Start: Aug 09, 21
Close: Nov 14, 21
Reveal: Nov 14, 21
How to get past the error "Missing entries for IDs ID_0095QL4S, ID_00E9F5X9 ... and more"?
6 Sep 2021, 21:40 · edited ~17 hours later · 5

This is probably a very silly error and I'm quite embarrassed to ask, but I've run out of alternatives (lol)... I'm trying to submit my final CSV. I tried submitting a very simple CSV file containing only two columns: Tweet_ID and type. The file contains 15581 rows, just as the test set should. However, I keep getting the error "Missing entries for IDs ID_0095QL4S, ID_00E9F5X9, ID_00HU96U6, ID_00IJ4SAW, ID_00N24MZN and more". This is weird, because those IDs are in my file, so I'm really lost.

Could anyone shed some light on this issue?

UPDATE: I finally understood. The separator should be a ',', not a ';'. Thank you guys for the support!

Thanks in advance!
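
To illustrate the fix from the update, here is a minimal sketch of a header check that tells a comma-separated file from a semicolon-separated one. The column names come from the post; the `header_matches` helper, the in-memory file contents, and the "label_a" value are hypothetical placeholders. A parser that expects commas sees a semicolon-separated header as one fused column, which is presumably why none of the IDs were matched.

```python
import csv
import io

EXPECTED_COLUMNS = ["Tweet_ID", "type"]  # column names taken from the post


def header_matches(csv_text: str, delimiter: str = ",") -> bool:
    """Return True if the first row, split on `delimiter`, equals EXPECTED_COLUMNS."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader, [])
    return header == EXPECTED_COLUMNS


# Hypothetical file contents; "label_a" is a placeholder, not a real class name.
semicolon_file = "Tweet_ID;type\nID_0095QL4S;label_a\n"
comma_file = "Tweet_ID,type\nID_0095QL4S,label_a\n"

print(header_matches(semicolon_file))  # False: parsed as one column "Tweet_ID;type"
print(header_matches(comma_file))      # True
```

In pandas, DataFrame.to_csv writes commas by default, so leaving the sep argument unset is enough to produce the second form.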

Discussion 5 answers

Make sure you pass index=False when saving the CSV file.

6 Sep 2021, 21:43
Upvotes 0

I did that! When I first submitted with the index, I was sure that removing it would work, but it didn't.
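
For reference, a small sketch of what index=False changes in the written file. The two-row frame and the "label_a"/"label_b" values are hypothetical placeholders. It only shows the extra unnamed column that the index adds; as the thread goes on to show, this was not the cause of the error here.

```python
import pandas as pd

# Hypothetical two-row frame; "label_a" and "label_b" are placeholders.
df = pd.DataFrame(
    {"Tweet_ID": ["ID_0095QL4S", "ID_00E9F5X9"], "type": ["label_a", "label_b"]}
)

# to_csv returns the CSV text as a string when no path is given.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

header_with_index = with_index.splitlines()[0]
header_without_index = without_index.splitlines()[0]

print(header_with_index)     # ,Tweet_ID,type  (leading unnamed index column)
print(header_without_index)  # Tweet_ID,type
```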

Faculty of sciences of monastir

If you built the output_submission dataframe yourself with pd.DataFrame() ... try output_submission = sample_submission.copy() instead, and fill the target column with your predictions, i.e. output_submission["target"] = predictions_array ...

6 Sep 2021, 23:37
Upvotes 0

What I've done: I imported the submission and test files, then added a column with the predictions to the submission file (I named the new column "type"). After that, I concatenated the test set (which I imported with a simple pd.read_csv()), like this: pd.concat([test, pd.DataFrame({"type": submission_pred})], axis=1). Finally, I exported a CSV file with the columns Tweet_ID and type. I'm not 100% sure I understood your advice. Should I make a copy of the dataframe created by pd.concat()?
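
As a side note on the pd.concat(..., axis=1) approach described above, here is a sketch of one way it can go wrong: concat aligns rows on index labels, not on position. The frame below with a non-default index is a hypothetical example (say, a test set after some rows were filtered), and "label_a"/"label_b" are placeholders. With a freshly read CSV and its default 0..n-1 index, as in the post, both approaches give the same rows.

```python
import pandas as pd

# Hypothetical test frame whose index is not 0..n-1 (e.g. after filtering rows).
test = pd.DataFrame({"Tweet_ID": ["ID_0095QL4S", "ID_00E9F5X9"]}, index=[5, 9])
submission_pred = ["label_a", "label_b"]

# concat aligns on index labels: 5 and 9 do not match 0 and 1,
# so the result has four rows padded with NaN.
misaligned = pd.concat([test, pd.DataFrame({"type": submission_pred})], axis=1)

# Assigning a plain list is positional, so it does not depend on the index.
aligned = test.copy()
aligned["type"] = submission_pred

print(len(misaligned))  # 4
print(len(aligned))     # 2
```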

Faculty of sciences of monastir

Try not to involve the test dataset when creating the submission file, e.g.:

import pandas as pd

sample_submission = pd.read_csv('/path/SampleSubmission.csv')

sample_submission["type"] = submission_pred  # predictions must follow the row order of SampleSubmission

sample_submission.to_csv('GBV_submission.csv', index=False)

7 Sep 2021, 15:17
Upvotes 0
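
Before uploading, the IDs in a submission can also be compared against the sample submission locally. The following is a rough sketch, not the platform's actual validation: the `missing_ids` helper and the small in-memory frames are hypothetical stand-ins for SampleSubmission.csv and a candidate file, and "label_a"/"label_b" are placeholders.

```python
import pandas as pd


def missing_ids(sample: pd.DataFrame, submission: pd.DataFrame,
                id_col: str = "Tweet_ID") -> list:
    """Return the sorted IDs that are in `sample` but absent from `submission`."""
    return sorted(set(sample[id_col]) - set(submission[id_col]))


# Hypothetical stand-ins for SampleSubmission.csv and a candidate submission.
sample = pd.DataFrame({"Tweet_ID": ["ID_0095QL4S", "ID_00E9F5X9", "ID_00HU96U6"]})
submission = pd.DataFrame(
    {"Tweet_ID": ["ID_0095QL4S", "ID_00HU96U6"], "type": ["label_a", "label_b"]}
)

print(missing_ids(sample, submission))  # ['ID_00E9F5X9']
```

Reading the saved file back with pd.read_csv and its default comma separator before running such a check would also surface a wrong separator, since the Tweet_ID column would then not exist under that name.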