I'm having trouble merging the 9 mapping files so I can add features to the train dataset. I tried using the os module in Python but I'm running into encoding errors. Does anyone have a workaround?
I hope you were able to read the files. There are 3 kinds of features that will help you a lot to boost your score. You can try using kNN with the coordinates as input to predict the district and region from external data.
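A rough sketch of the kNN idea, in case it helps. It assumes scikit-learn is available and that both tables have coordinate columns. The column names `Latitude`, `Longitude` and `District`, and the helper `predict_district`, are made up for illustration, so swap in whatever the real files use. It also treats lat/lon as plain Euclidean coordinates, which is only an approximation.

```python
import pandas as pd
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical column names; replace with the ones in your data.
COORDS = ["Latitude", "Longitude"]


def predict_district(reference, train, n_neighbors=1):
    """Give each row of `train` the district of its nearest reference points.

    `reference` is external data with coordinates and a known "District";
    `train` only needs the coordinate columns.
    """
    knn = KNeighborsClassifier(n_neighbors=n_neighbors)
    knn.fit(reference[COORDS], reference["District"])

    labelled = train.copy()  # leave the caller's DataFrame untouched
    labelled["District"] = knn.predict(train[COORDS])
    return labelled
```

The same function can be reused for the region by fitting on a region column instead of the district one.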
I'm facing a similar problem
Use pandas with the "ISO-8859-1" encoding, like this:
pd.read_csv('FSDT_FinAccessMapping/3rd_ppp_for_upload_win.csv', encoding="ISO-8859-1")
In general, when I run into encoding errors, it's usually because of unusual characters in the dataset (typically non-ASCII ones, such as accented letters in people's names).
A quick way to get around this is to use:
pd.read_csv(fpath, encoding='latin')
I like to use latin (Python treats it as an alias for ISO-8859-1) because I can't always remember all the ISO codes and would otherwise have to google around a bit, while latin usually just works ;)
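To read all the mapping files in one go, something along these lines might work. It is only a sketch: `load_mapping_files` is a made-up helper name, and it assumes every file in the folder is a CSV that pandas can parse with default settings. It just loads them; how to join them onto the train set depends on which columns they share.

```python
from pathlib import Path

import pandas as pd


def load_mapping_files(folder, encoding="ISO-8859-1"):
    """Read every .csv file in `folder` into a dict of DataFrames.

    Keys are the file names without the .csv extension.
    """
    frames = {}
    for path in sorted(Path(folder).glob("*.csv")):
        # ISO-8859-1 maps every possible byte to a character, so the
        # decoding step itself cannot raise a UnicodeDecodeError.
        frames[path.stem] = pd.read_csv(path, encoding=encoding)
    return frames
```

For example, `load_mapping_files('FSDT_FinAccessMapping')` should give one DataFrame per file, including one under the key `3rd_ppp_for_upload_win`.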