GeoAI Challenge Location Mention Recognition from Social Media by ITU

1 000 CHF
Challenge completed ~2 years ago
Prediction
Natural Language Processing
150 joined
28 active
Start: Jul 19, 23
Close: Oct 22, 23
Reveal: Oct 22, 23
Sodiq_Babawale_
University of Ibadan
Tweet_id Mismatch
Help · 10 Oct 2023, 14:05 · 10

Greetings guys,

I keep getting an error while trying to submit my predictions to this competition. On checking, I found that more than 1,000 tweet_id values in the test dataset are not the same as the ones in the sample submission file. Since many people have successfully made a submission to the leaderboard, how were you able to resolve this?

Discussion 10 answers
ML_Wizzard
Nasarawa State University

Same here, I am getting this error too:

Missing entries for IDs ID_728597478946287616_loc17_end, ID_902603886535426049_loc4_start, ID_730464599670194178_loc9_end, ID_1176501926273474561_loc2_end, and ID_902861036470112256_loc2_start

10 Oct 2023, 15:28
Upvotes 0
ML_Wizzard
Nasarawa State University

@moadel2002, how did you make your submission?

Sartify LLC

You have to download the current sample submission file. Zindi updated it recently, so it now has all the required IDs.

Sartify LLC

To clarify further, here is what your submission file should look like. There are about 4066 tweets in the test data across all events, and the task is to predict the start and end index of each location mentioned in each tweet, then put those values into rows.

Example one: if a tweet has two locations, that means four rows for that tweet, a start-index row and an end-index row for each location.

Example two: if a tweet has three locations, that means six rows for that tweet, again a start-index row and an end-index row for each location.

However, this competition assumes a maximum of 17 locations per tweet, so that the submission file has a consistent, predefined format for all of us. If a tweet has only 1 location, you need to fill the remaining locations with 0. That gives 34 rows: two rows for the start and end index of that 1 location, plus 32 rows for the start and end index of the other 16 locations, all set to 0. If a tweet has 17 locations, you fill in all 34 rows for that tweet with the start and end index of each of the 17 locations.

So, in simple math: you have 4066 tweets, each with 17 location slots, and each slot has a start-index row and an end-index row. That gives a total of 4066 × 17 × 2 = 138,244 rows in your submission file, which matches the total number of rows in the current sample submission file provided by Zindi.
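
The padding scheme described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the row ID pattern `ID_<tweet_id>_loc<k>_<start|end>` is inferred from the error message quoted earlier in the thread, numbering the slots from `loc1` to `loc17` is a guess, and `build_submission_rows` is a hypothetical helper name. The actual column names and row order should be taken from the sample submission file.

```python
MAX_LOCATIONS = 17  # per the thread: every tweet is padded to 17 location slots


def build_submission_rows(predictions, max_locations=MAX_LOCATIONS):
    """Return (row_id, value) pairs, zero-padded to max_locations slots per tweet.

    predictions: dict mapping tweet_id (str) -> list of (start, end) index pairs.
    """
    rows = []
    for tweet_id, spans in predictions.items():
        if len(spans) > max_locations:
            raise ValueError(f"{tweet_id}: more than {max_locations} locations")
        # Unused location slots get start = end = 0.
        padded = list(spans) + [(0, 0)] * (max_locations - len(spans))
        # Assumption: slots are numbered loc1..loc17 (not loc0..loc16).
        for k, (start, end) in enumerate(padded, start=1):
            rows.append((f"ID_{tweet_id}_loc{k}_start", start))
            rows.append((f"ID_{tweet_id}_loc{k}_end", end))
    return rows


# Toy example: one tweet with a single location spanning indices 10..16.
example = build_submission_rows({"728597478946287616": [(10, 16)]})
print(len(example))  # 34 rows: 1 real location + 16 zero-filled slots, 2 rows each
```

With 4066 tweets in `predictions`, the same function yields 4066 × 34 = 138,244 rows, the row count mentioned above.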

ML_Wizzard
Nasarawa State University

Thank you for your response, @Inno_Charz, but I am still getting the same error. Any assistance in resolving this issue would be greatly appreciated.

Sodiq_Babawale_
University of Ibadan

Thank you @Inno_Charz. I understand the explanation you gave; it is exactly the procedure I followed. Extracting the tweet IDs from the sample submission file, we expect to get 4066 unique IDs, which I did get. However, more than 1,000 of the 4066 unique IDs in the sample submission file differ from the 4066 unique IDs in the test data.
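
One way to pin down exactly which IDs differ is to compare the two ID sets as strings. The sketch below is a hedged illustration, not the organisers' method: it assumes row IDs look like `ID_<tweet_id>_loc<k>_<start|end>` (as in the error message quoted earlier in the thread), that the test file's tweet IDs may or may not carry an `ID_` prefix, and `tweet_id_from_row_id` and `compare_tweet_ids` are hypothetical helper names.

```python
import re

# Assumed row ID layout, inferred from the error message in this thread.
ROW_ID_PATTERN = re.compile(r"^ID_(?P<tweet_id>.+)_loc\d+_(?:start|end)$")


def tweet_id_from_row_id(row_id):
    """Extract the bare tweet ID from a submission row ID."""
    match = ROW_ID_PATTERN.match(str(row_id).strip())
    if match is None:
        raise ValueError(f"Unexpected row ID format: {row_id!r}")
    return match.group("tweet_id")


def normalise_tweet_id(tweet_id):
    """Compare IDs as strings; drop a leading 'ID_' if the test file uses one."""
    text = str(tweet_id).strip()
    return text[3:] if text.startswith("ID_") else text


def compare_tweet_ids(test_tweet_ids, sample_row_ids):
    """Return the tweet IDs present in only one of the two files."""
    test_ids = {normalise_tweet_id(t) for t in test_tweet_ids}
    sample_ids = {tweet_id_from_row_id(r) for r in sample_row_ids}
    return {
        "only_in_test": test_ids - sample_ids,
        "only_in_sample": sample_ids - test_ids,
    }


diff = compare_tweet_ids(
    ["728597478946287616", "111"],
    ["ID_728597478946287616_loc17_end", "ID_902603886535426049_loc4_start"],
)
print(sorted(diff["only_in_test"]))    # ['111']
print(sorted(diff["only_in_sample"]))  # ['902603886535426049']
```

Keeping the IDs as strings throughout avoids any change to the long numeric IDs during loading, so a remaining mismatch would point to the files themselves rather than to how they were parsed.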

Sartify LLC

How many rows does the sample submission file you are using have? Please check and let me know.

Sartify LLC

Check the total number of rows in the sample submission file you are using, just before extracting its Tweet_Id values.

Sodiq_Babawale_
University of Ibadan

The number of rows is 138244

Sodiq_Babawale_
University of Ibadan

@Milind, @quokka, @moadel2002, could you please explain how you resolved this issue?

12 Oct 2023, 10:24
Upvotes 0