Hello,
The test and train data available have different numbers of columns. After modeling and applying a machine learning algorithm, when I try to predict, I get an error saying that the test data (6 columns) has fewer columns than the train data (about 45 columns).
What can I do in this situation, please?
Hi @Jimal, you need to add the missing columns before applying your model. Recall that the final model will be expecting the same number of features you used to build it; once the columns match, it can apply its weights and predict. Hope that helps.
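A minimal sketch of what adding the missing columns could look like, assuming the data is in pandas DataFrames. The frames and column names below (`month`, `year`, `price`) are made up for illustration and are not taken from the competition data; only `sellin` is the target name used in this challenge. Note that `reindex` only creates the columns as NaN placeholders; what values to put in them is a separate modeling decision.

```python
import pandas as pd

# Hypothetical toy frames; the column names are illustrative only.
train = pd.DataFrame({
    "month": [1, 2, 3],
    "year": [2020, 2020, 2020],
    "price": [9.5, 9.7, 9.6],
    "sellin": [100, 120, 90],
})
test = pd.DataFrame({"month": [4, 5], "year": [2020, 2020]})

# Features the model was trained on (everything except the target).
feature_cols = [c for c in train.columns if c != "sellin"]

# Columns the model expects but the test frame lacks.
missing = [c for c in feature_cols if c not in test.columns]

# reindex adds the missing columns as NaN and orders columns as in training.
test_aligned = test.reindex(columns=feature_cols)
```

After this, `test_aligned` has the same columns, in the same order, as the training features, so the shape error goes away. The NaN values still have to be replaced with something meaningful, or handled by a model that accepts missing values.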
Thank you for the response.
If I'm to add 30+ columns to the test data, what will the values of each column be? I feel it's not right to come up with random values for each column.
I think that's the challenge. I haven't had the time to model on this challenge yet, but I think that's the key.
Okay
Thank you
Hello Jimal,
This is a forecasting challenge. The majority of the information in train is target-related information, or by-products/breakdowns of the target (sellin). Hence, these are all unknown features at test time. They are provided so that you can build a profile/representation of the distribution pattern for the product of interest, such that, given only a month, year and product name, you are able to infer future forecasts. You need to structure the dataset so that, for the given database information, there is a corresponding sellin target to be predicted, or apply a time series model or sequence-to-sequence model to represent the sellin trend of the dataset.
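As a rough sketch of one way to structure the data like this (not the only way, and not an official approach), you could turn the sellin history into lag features per product, so that every feature is something you would actually know at forecast time. The column names `product`, `year`, `month` and the toy values below are assumptions for illustration.

```python
import pandas as pd

# Toy history; the values are invented for illustration.
train = pd.DataFrame({
    "product": ["A"] * 4 + ["B"] * 4,
    "year": [2020] * 8,
    "month": [1, 2, 3, 4] * 2,
    "sellin": [10, 12, 11, 15, 5, 6, 7, 8],
})

# Order each product's history chronologically before shifting.
train = train.sort_values(["product", "year", "month"]).reset_index(drop=True)

# Lag features: sellin from one and two months earlier, per product.
grouped = train.groupby("product")["sellin"]
train["sellin_lag1"] = grouped.shift(1)
train["sellin_lag2"] = grouped.shift(2)

# Keep only rows where both lags are available.
supervised = train.dropna(subset=["sellin_lag1", "sellin_lag2"])

feature_cols = ["month", "year", "sellin_lag1", "sellin_lag2"]
X, y = supervised[feature_cols], supervised["sellin"]
```

Here `X` and `y` can go into any regressor. The product name would still need encoding if you want it as a feature, and forecasting more than one month ahead means either using longer lags or feeding predictions back in as lags.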
I hope this helps.
Yes, to an extent. Thank you.
Thinking about it again, your advice gave me an idea.
Thank you
Awesome, all the best!