I see a lot of values in train containing something like +AC. Is this an error from producing or cleaning the data that we need to remove? The sample submission also has +AC in its first column. But if you look at test, the first column, ward, does not have it. So do we need to submit with the +AC characters or not?
I also saw that. I have been trying to overlook it, but the columns don't match at all for statistical inference. Will check again, though.
So what do you suggest as a way of cleaning the data?
I tried just now, and even the submissions have to be made in the +AC- format. The values with this +AC- format are actually very few in the training data, so for now I am setting them to NaN. Also, if you remove the target column, train and test have the same column order, so you can rename the train columns to match test (or vice versa) to make things easier.
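In case it helps anyone, here is a rough sketch of the two steps described above: blanking the affected values to NaN and renaming the train columns by position. It uses only the standard library and plain lists. The marker string `+AC`, the helper names `blank_marked_values` and `rename_like_test`, and the target column name are all assumptions for illustration, not something taken from the competition files.

```python
import math

# Assumed prefix of the stray sequences; adjust to what your files contain.
MARKER = "+AC"


def blank_marked_values(rows):
    """Replace any string cell that contains the marker with NaN."""
    return [
        [math.nan if isinstance(v, str) and MARKER in v else v for v in row]
        for row in rows
    ]


def rename_like_test(train_cols, test_cols, target):
    """Rename the non-target train columns to the test names, by position."""
    features = [c for c in train_cols if c != target]
    if len(features) != len(test_cols):
        raise ValueError("column counts differ after dropping the target")
    mapping = dict(zip(features, test_cols))
    # The target column is not in the mapping, so it keeps its own name.
    return [mapping.get(c, c) for c in train_cols]
```

The positional rename only makes sense if the column order really is identical once the target is dropped, so it is worth comparing a few columns by eye before relying on it.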
Thanks devnikhilmishra for bringing this to our attention. We have updated the data files to address the issue. Sorry about that!
Thanks for clarifying. I had gone as far as renaming.
You have to rename them. If you look closely, the column names are the same as the test columns, except that you have to remove the +AC... characters around them.
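A minimal sketch of stripping those characters from the column names, assuming the stray sequences look like `+AC` followed by zero or more letters, digits, or slashes and a closing `-` (for example `+AC-` or `+AC0-`). The exact pattern in the files may differ, and `clean_name` and `clean_columns` are hypothetical helper names.

```python
import re

# Assumed shape of the stray sequences: "+AC", optional characters, then "-".
STRAY = re.compile(r"\+AC[A-Za-z0-9/]*-")


def clean_name(name):
    """Remove every stray sequence from a single column name."""
    return STRAY.sub("", name).strip()


def clean_columns(names):
    """Clean a list of column names, leaving unaffected names untouched."""
    return [clean_name(n) for n in names]
```

With pandas, something like `df.columns = clean_columns(df.columns)` should apply the same cleanup to a loaded DataFrame.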