Gender-Based Violence Tweet Classification Challenge

yukioandre

How to surpass the error "Missing entries for IDs ID_0095QL4S, ID_00E9F5X9 ... and more"?

6 Sep 2021, 21:40 · edited ~17 hours later

This must be a very stupid error and I'm actually quite embarrassed to ask, but I've run out of alternatives (lol)... I'm trying to submit my final CSV. I tried a very simple CSV file containing only two columns: Tweet_ID and type. The file has 15581 rows, the same as the test set. However, I keep getting the error "Missing entries for IDs ID_0095QL4S, ID_00E9F5X9, ID_00HU96U6, ID_00IJ4SAW, ID_00N24MZN and more". This is weird, because those IDs are in my file. So I'm really lost.

Could anyone shed some light on this issue?

UPDATE: I finally understood. The separator should be a ',', not a ';'. Thank you guys for the support!

Thanks in advance!
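
The fix from the update can be illustrated with a small sketch. It assumes pandas; the two IDs are taken from the error message, while the labels and the `header_fields` helper are made up for illustration. It shows why a ';'-separated file looks like it has a single fused column to a reader that expects commas, which would plausibly explain why none of the IDs are found.

```python
import csv
import io

import pandas as pd

# IDs come from the error message above; the labels are hypothetical.
submission = pd.DataFrame(
    {"Tweet_ID": ["ID_0095QL4S", "ID_00E9F5X9"], "type": ["label_a", "label_b"]}
)


def header_fields(csv_text):
    """Parse the header line the way a comma-expecting reader would."""
    return next(csv.reader(io.StringIO(csv_text), delimiter=","))


semicolon_csv = submission.to_csv(sep=";", index=False)
comma_csv = submission.to_csv(index=False)  # sep="," is the pandas default

print(header_fields(semicolon_csv))  # ['Tweet_ID;type'] -> one fused column
print(header_fields(comma_csv))      # ['Tweet_ID', 'type']
```

Opening the exported file in a plain text editor (not a spreadsheet) is a quick way to see which separator was actually written.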

Discussion 5 answers

Nereus

Make sure you include index=False when saving the CSV file.
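
A minimal sketch of what that flag changes, assuming pandas and a made-up two-row dataframe (the IDs and labels are placeholders):

```python
import pandas as pd

# Hypothetical two-row submission, for illustration only.
submission = pd.DataFrame({"Tweet_ID": ["ID_A", "ID_B"], "type": ["x", "y"]})

with_index = submission.to_csv()                # index=True is the default
without_index = submission.to_csv(index=False)

print(with_index.splitlines()[0])     # ,Tweet_ID,type  (extra unnamed first column)
print(without_index.splitlines()[0])  # Tweet_ID,type
```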

6 Sep 2021, 21:43

Upvotes 0

yukioandre

I did that! When I first submitted with the index included, I was sure that removing it would fix things, but it didn't.

replied to Nereus · 6 Sep 2021, 22:00

Upvotes 0

AbdelAli_Bouallegui

Faculty of sciences of monastir

If you created the output_submission dataframe yourself with pd.DataFrame(), try output_submission = sample_submission.copy() instead and fill the target column with your predictions, i.e. output_submission["target"] = predictions_array.

6 Sep 2021, 23:37

Upvotes 0

yukioandre

What I've done: I imported the submission and test files, then added a column with the predictions to the submission file (I named the new column "type"). After that, I concatenated the predictions onto the test set (which I imported with a plain pd.read_csv()), like this: pd.concat([test, pd.DataFrame({"type": submission_pred})], axis=1). Then I exported a CSV file with the columns Tweet_ID and type. I'm not 100% sure I understood your advice. Should I make a copy of the dataframe created by pd.concat()?

replied to AbdelAli_Bouallegui · 7 Sep 2021, 14:20
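
For context on the pd.concat() call described above: with axis=1, pandas aligns rows by index label rather than by position. With the default index from a plain pd.read_csv() the rows line up, so this was probably not the cause here (the update in the original post points to the separator). The sketch below, using a made-up three-row test set, shows what can go wrong when the index is not 0..n-1, and a positional alternative.

```python
import pandas as pd

# Hypothetical test set whose index is not 0..n-1 (e.g. after filtering rows).
test = pd.DataFrame({"Tweet_ID": ["ID_A", "ID_B", "ID_C"]}, index=[0, 2, 5])
submission_pred = ["x", "y", "z"]

# axis=1 aligns on index labels, so mismatched labels create extra rows with NaN.
misaligned = pd.concat([test, pd.DataFrame({"type": submission_pred})], axis=1)
print(len(misaligned))                      # 4 rows instead of 3
print(misaligned["Tweet_ID"].isna().sum())  # 1 row has no ID

# Assigning a plain list is positional, so no index alignment takes place.
aligned = test[["Tweet_ID"]].assign(type=submission_pred)
print(len(aligned))                         # 3
```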

Upvotes 0

AbdelAli_Bouallegui

Faculty of sciences of monastir

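Try not to involve the test dataset when creating the submission file, e.g.:

import pandas as pd

# Start from the sample file so the ID column matches the expected format
sample_submission = pd.read_csv('/path/SampleSubmission.csv')
# submission_pred holds the model predictions, one per row of the sample file
sample_submission["type"] = submission_pred
sample_submission.to_csv('GBV_submission.csv', index=False)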
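
Before uploading, a local check along these lines can catch the problem early. This is only a sketch: `missing_ids` is a hypothetical helper, the in-memory strings stand in for the real files, and the ID column name `Tweet_ID` is taken from the original post. How the platform actually validates uploads is not known here; the check simply mimics the "Missing entries for IDs" message.

```python
import io

import pandas as pd


def missing_ids(submission_csv, sample_csv, id_col="Tweet_ID"):
    """Return sample-submission IDs that a comma-based reader cannot find in the submission."""
    sample = pd.read_csv(sample_csv)
    submission = pd.read_csv(submission_csv)
    if id_col not in submission.columns:
        # Typical symptom of a wrong separator: the header is one fused column.
        return sorted(sample[id_col])
    return sorted(set(sample[id_col]) - set(submission[id_col]))


# Hypothetical in-memory files standing in for the real CSVs.
sample = "Tweet_ID,type\nID_A,\nID_B,\n"
good = "Tweet_ID,type\nID_A,x\nID_B,y\n"
bad = "Tweet_ID;type\nID_A;x\nID_B;y\n"

print(missing_ids(io.StringIO(good), io.StringIO(sample)))  # []
print(missing_ids(io.StringIO(bad), io.StringIO(sample)))   # ['ID_A', 'ID_B']
```

The same function accepts file paths instead of io.StringIO objects, since pd.read_csv takes either.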

7 Sep 2021, 15:17

Upvotes 0
