I have been building the model and it's doing quite well, but my challenge is generating the submission file.
My model returns indicators only for the articles provided in the test dataset, and I generate the submission file from that output. Of course, the test data does not have articles about every indicator, so my submission file has fewer indicators than the 27 required as column headings. When I try to submit, it is rejected with an error saying I have missing IDs. What could be the issue with my work? Any help please, I am a junior data scientist without much experience.
Hi Kazoza,
You should be generating predictions for every row and each of the 27 indicators in the test data. To build your submission, you could try:
import pandas as pd

# some numpy array of predictions from the model
Y = model.predict(X)  # has Y.shape = (something, 27)

# column names for the output, read from the header row of the format file
with open('Devex_submission_format.csv') as f:
    cols = f.readline().rstrip('\n').split(',')

# build the output dataframe, where the index is the unique ID column
# (this assumes X is a DataFrame indexed by that ID)
output = pd.DataFrame(Y, index=X.index, columns=cols[1:])

# save for submission
output.to_csv('awesome_submission.csv')
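If your model really only produces some of the indicator columns (or skips some rows), one possible workaround is to pad the predictions out to the full set of IDs and columns before saving. The sketch below is only an illustration: the helper name pad_to_submission_format, the toy IDs and the indicator names are all made up, and filling the gaps with 0.0 is an assumption that may or may not suit the competition metric.

```python
import pandas as pd

def pad_to_submission_format(preds, all_ids, all_columns, fill_value=0.0):
    """Return preds with every required row ID and indicator column present.

    Rows or columns missing from preds are created and filled with fill_value.
    """
    return preds.reindex(index=all_ids, columns=all_columns, fill_value=fill_value)

# Toy data: the model only predicted one indicator for two of three rows.
preds = pd.DataFrame({"ind_b": [0.9, 0.2]}, index=[101, 103])

padded = pad_to_submission_format(
    preds,
    all_ids=[101, 102, 103],                    # hypothetical IDs from the format file
    all_columns=["ind_a", "ind_b", "ind_c"],    # hypothetical indicator names
)
print(padded)
```

In practice you would take all_ids and all_columns from the submission format file itself (for example by reading it with pd.read_csv and index_col=0), so the output matches what the platform expects.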