Sustainable Development Goals (SDGs): Text Classification Challenge
$1,000 USD
Classify text and documents by relevance to the 27 indicators of SDG #3 (Health and Well-Being)
5 September–12 November 2018 23:59
246 data scientists enrolled, 50 on the leaderboard
published 17 Oct 2018, 08:08
edited less than a minute later

I have been building the model and it's doing quite well, but my challenge is generating the submission file.

My model returns indicators only for the articles provided in the testing dataset, and from that I generate the submission file. Since the test data of course does not contain articles about every indicator, my submission file has fewer than the 27 indicators required as column headings. When I try to submit, it is rejected with a message that I have missing IDs. What could be the issue with my work? Any help please, I am a junior data scientist without much experience.
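
A minimal sketch of one way to pad the missing indicator columns, assuming the predictions are held as one dict per test row and that an indicator the model did not predict should be written as 0. The column names and IDs below are made up for illustration; the real ones would come from the header of the competition's submission format file.

```python
import csv
import io


def write_submission(rows, header, out):
    """Write one line per test row, with every column listed in `header`.

    rows   -- list of dicts, each holding the ID plus only the indicators
              the model predicted for that row
    header -- full list of column names (ID column first, then all indicators)
    out    -- any writable text file object
    """
    # restval=0 fills in every column that a row has no value for
    writer = csv.DictWriter(out, fieldnames=header, restval=0, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical column names and IDs, for illustration only
header = ["ID", "indicator_a", "indicator_b", "indicator_c"]
rows = [
    {"ID": "doc_1", "indicator_a": 1},
    {"ID": "doc_2", "indicator_c": 1},
]

buffer = io.StringIO()
write_submission(rows, header, buffer)
submission_text = buffer.getvalue()
```

Here neither row mentions indicator_b, yet the output still has that column, filled with 0, so the file keeps the full set of headings.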

Hi Kazoza,

You should be generating predictions for every row in the test data and for each of the 27 indicators. To build your submission you could try:

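import pandas as pd

# some numpy array of predictions from the model
Y = model.predict(X)  # should have Y.shape == (number of test rows, 27)
# column names for the output, taken from the header row of the submission format file
with open('Devex_submission_format.csv') as f:
    cols = f.readline().strip().split(',')
# build the output dataframe, where the index is the unique ID column
# (this assumes X is a DataFrame indexed by those IDs)
output = pd.DataFrame(Y, index=X.index, columns=cols[1:])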
# save for submission