Social Media Prediction Challenge
$1,000 USD
Which tweets from these major African companies will get the most retweets?
4 September–25 November 2018 23:59
265 data scientists enrolled, 28 on the leaderboard
published 24 Oct 2018, 10:52
edited less than a minute later

I'm having trouble submitting. My train set has 96562 observations and my test set has 32210. I am getting the error "Missing entries for IDs 958279977031602176, 1027109111924711424, 1028207216933969920, 1004380870998941698, 1003949351775895553 and more". Could anyone tell me how many observations their test set has, or send me a JSON file of the test set values? Thanks.

You can open the test file in Excel by importing it as text (open a blank sheet, go to the Data tab --> From Text, select your file, and choose the string option).

Some programs (including Excel) tend to drop digits beyond a certain precision when you open a file the 'normal' way, which silently changes long numeric IDs like the ones in the error message.
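
To illustrate the precision problem, here is a minimal Python sketch using one of the IDs from the error message. It assumes the ID ends up stored as a 64-bit float at some point (as happens in a spreadsheet or a float column); the variable names are only for illustration.

```python
# One of the IDs quoted in the error message above.
tweet_id = 1004380870998941698

# A 64-bit float has a 53-bit significand, so not every integer
# above 2**53 can be represented exactly.
exceeds_exact_range = tweet_id > 2**53

# Round-trip the ID through a float, as a float-typed column would.
round_tripped = int(float(tweet_id))

# The ID no longer matches, so a grader comparing IDs would report it missing.
id_changed = round_tripped != tweet_id

print(exceeds_exact_range, round_tripped, id_changed)
```

Keeping the IDs as text from the moment the file is read avoids this, which is what the "Text" column format does in Excel.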

I'm using Linux, so that option doesn't work. Let me just transfer my files to a Windows machine. Thanks.

@odartey, after choosing the file I am not seeing the string option. Which version of MS Office are you using, and where is that option located? I am new to Windows.

When you choose the file, it opens up a Text Import Wizard for you, where you select the kind of data type you want, the delimiter you want to use and then the column format, which is "Text" (not string, sorry). I'm using MS Office 2016.

If you are using Excel 2016, import your data using the Power Query tool available in Excel. With this tool you can even import a JSON file and transform your data within the Power Query Editor.

That option only removes a few of them; I am still getting the same error. Thanks so much.
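
One way to see locally which IDs would be reported as missing is to compare the ID sets of the test file and the submission. This is only a sketch: the in-memory CSV contents below are hypothetical stand-ins for the real files, and it assumes pandas is available and that both files have an "id" column.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the real test file and submission file.
test_csv = io.StringIO("id\n1004380870998941698\n1003949351775895553\n")
submission_csv = io.StringIO(
    "id,retweet_count\n"
    "1004380870998941696,3\n"  # last digits altered by a float round trip
    "1003949351775895553,7\n"
)

# Read the IDs as strings so they are compared digit for digit.
test_ids = set(pd.read_csv(test_csv, dtype={"id": str})["id"])
sub_ids = set(pd.read_csv(submission_csv, dtype={"id": str})["id"])

# IDs the test set expects but the submission lacks, and the reverse.
missing = sorted(test_ids - sub_ids)
unexpected = sorted(sub_ids - test_ids)

print("missing:", missing)
print("unexpected:", unexpected)
```

If the unexpected IDs differ from the missing ones only in their last few digits, the IDs were most likely altered somewhere between reading the test file and writing the submission.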

After predicting the retweets for the test data, save a new dataset on your machine that has only the "id" and "retweet_count" columns, using the following code:

# `model` is assumed here to be a pandas DataFrame holding the "id" and "retweet_count" columns
model.to_csv("tweets.csv", index=False)  # Then upload directly, without opening the file in Excel
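
A slightly fuller sketch of the same workflow, assuming pandas and a CSV-formatted test file. The file contents, the "text" column and the placeholder predictions are hypothetical; only the "id" and "retweet_count" column names come from the thread.

```python
import io

import pandas as pd

# Hypothetical stand-in for the test file; the real file and its columns may differ.
test_csv = io.StringIO(
    "id,text\n"
    "1004380870998941698,first tweet\n"
    "1003949351775895553,second tweet\n"
)

# Read the IDs as strings so no digits are lost on the way in.
test = pd.read_csv(test_csv, dtype={"id": str})

# Placeholder predictions; a real model's output would go here.
predictions = [3, 7]

# Keep only the two columns the submission needs.
submission = pd.DataFrame({"id": test["id"], "retweet_count": predictions})

# Written to an in-memory buffer here; pass a path such as "tweets.csv" to write a file.
out = io.StringIO()
submission.to_csv(out, index=False)
csv_text = out.getvalue()
print(csv_text)
```

Because the IDs stay as strings from read to write, the file can be uploaded as-is without passing through a spreadsheet.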