I'm having a problem making predictions on the test set. The training set has a different number of columns from the test set, and after vectorization the two become completely different. What's the best approach to perform predictions on the test set? Should I restrict my training data to the columns that are present in the test set? And would that even work, given that vectorizing the train and test sets separately produces completely different columns? Kindly assist.
Hi Geoffrey, check this discussion for some hints:
https://zindi.africa/hackathons/to-vaccinate-or-not-to-vaccinate-its-not-a-question/discussions/1167
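In case it helps, here is a minimal sketch of one common way around the mismatch: fit the vectorizer on the training set only, then reuse that same fitted vectorizer to transform the test set, so both end up with identical columns. It assumes text features and scikit-learn's `CountVectorizer` (the thread doesn't say which vectorizer is in use), and the toy sentences are made up for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy data standing in for the real train/test text columns.
train_texts = ["vaccines are safe", "i will not vaccinate"]
test_texts = ["vaccines are great", "not sure about vaccines"]

vectorizer = CountVectorizer()

# Learn the vocabulary (i.e. the output columns) from the training set only.
X_train = vectorizer.fit_transform(train_texts)

# Reuse the fitted vectorizer on the test set: transform, NOT fit_transform.
# Words that never appeared in training are simply ignored.
X_test = vectorizer.transform(test_texts)

# Both matrices now have the same number of columns, in the same order.
print(X_train.shape, X_test.shape)
```

Calling `fit_transform` separately on the test set is what produces the different columns, because it learns a new vocabulary from the test data. If the raw tables also differ (for example, the test set has no label column), selecting the shared feature columns before vectorizing should be enough.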
Thanks very much