Having trouble merging the 9 mapping files so I can add features to the train dataset. Tried using the os module in Python but I'm running into encoding errors. Anyone have a workaround?
I'm facing a similar problem
Use pandas and the "ISO-8859-1" encoding.
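A minimal sketch of that suggestion, in case it helps. The file names, the `id` key column, and the helper names `load_mapping_files` / `add_mapping_features` are all made up for the demo; swap in whatever the real mapping files share with the train set.

```python
import tempfile
from pathlib import Path

import pandas as pd


def load_mapping_files(folder, pattern="*.csv", encoding="ISO-8859-1"):
    """Read every matching CSV in `folder` into a dict of DataFrames keyed by file stem."""
    return {
        path.stem: pd.read_csv(path, encoding=encoding)
        for path in sorted(Path(folder).glob(pattern))
    }


def add_mapping_features(train, mappings, key):
    """Left-join each mapping table onto `train` using the shared `key` column."""
    merged = train
    for name, table in mappings.items():
        # Suffixes only kick in if two tables share a non-key column name.
        merged = merged.merge(table, on=key, how="left", suffixes=("", f"_{name}"))
    return merged


# Tiny demo with made-up data: two mapping files written in ISO-8859-1.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "names.csv").write_text("id,name\n1,Jos\xe9\n2,Zo\xeb\n", encoding="ISO-8859-1")
    Path(tmp, "regions.csv").write_text("id,region\n1,North\n2,South\n", encoding="ISO-8859-1")
    mappings = load_mapping_files(tmp)

train = pd.DataFrame({"id": [1, 2, 3], "target": [10, 20, 30]})
train_with_features = add_mapping_features(train, mappings, key="id")
print(train_with_features)
```

The left join keeps every train row, so rows without a match in a mapping file just get NaN for those features.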
In general when I run into encoding errors, it's usually because of some strange characters in the dataset often found in people's names etc.
A quick way to get around this is to pass encoding="latin" when reading the file.
I like to use latin because I can't always remember all the ISO codes and then would have to google around a bit, while latin usually just works ;)
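For anyone wondering why "latin" works: in Python it is just a built-in alias for ISO-8859-1. A small sketch with a made-up name as sample data, showing the alias and the kind of bytes that trip up UTF-8:

```python
import codecs

# "latin" is one of Python's built-in aliases for ISO-8859-1.
same_codec = codecs.lookup("latin").name == codecs.lookup("ISO-8859-1").name

# Made-up sample: a name with accented characters, stored as ISO-8859-1 bytes.
raw = "Jos\xe9 Mu\xf1oz".encode("ISO-8859-1")

# Decoding those bytes as UTF-8 fails, which is the typical encoding error.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding with the "latin" alias succeeds.
decoded = raw.decode("latin")
print(same_codec, utf8_ok, decoded)
```

One caveat: ISO-8859-1 maps every possible byte to some character, so it never raises an error. If a file is really UTF-8 or Windows-1252, it will load fine but some characters may come out wrong, so it's worth eyeballing a few names after loading.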
I hope you were able to read the files. There are 3 kinds of features that will help you a lot to boost your score. You can try using kNN with coordinates as input to predict the district and region from external data.
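A rough sketch of the kNN idea, assuming scikit-learn is available. The coordinates and district labels below are invented for the demo; the real ones would come from whatever external dataset you use, and the same approach would apply to region.

```python
from sklearn.neighbors import KNeighborsClassifier

# Made-up external data: (latitude, longitude) points with a known district.
external_coords = [
    [-1.28, 36.82], [-1.29, 36.83], [-1.30, 36.81],
    [-4.04, 39.66], [-4.05, 39.67], [-4.03, 39.65],
]
external_districts = [
    "district_a", "district_a", "district_a",
    "district_b", "district_b", "district_b",
]

# Fit kNN on the external points. Plain Euclidean distance on lat/lon is only
# an approximation of real distance, but is usually fine over a small area.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(external_coords, external_districts)

# Made-up coordinates from the train set that have no district yet.
train_coords = [[-1.27, 36.80], [-4.06, 39.68]]
predicted_districts = knn.predict(train_coords)
print(list(predicted_districts))
```

The predicted district can then be added as a new categorical column on the train set.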