Please, friends, I get this error after running a classification on train.csv. How can I get rid of it? Thanks.
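import numpy as np
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Features: every column except the ID and the target
x = np.array(df_train.drop(['user_id', 'CHURN'], axis=1))
x = preprocessing.scale(x)
y = np.array(df_train['CHURN'])
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
clf = LinearRegression(n_jobs=-1)
clf.fit(x_train, y_train)
accuracy = clf.score(x_test, y_test)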
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
This is a classification problem.
You have to handle the missing values (NaNs). You can fill them with the mean, median, or mode, or use an arbitrary value.
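As a rough sketch of the filling options mentioned above: the small DataFrame here is made up and only stands in for train.csv, and its column names ("amount", "region") are hypothetical. Which strategy suits the real data is a judgment call.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for train.csv; column names are made up.
df = pd.DataFrame({
    "amount": [10.0, np.nan, 30.0, 20.0],
    "region": ["north", None, "north", "south"],
})

# Count missing values per column before deciding how to fill them.
missing_before = df.isna().sum()

# Numeric column: fill with the median (the mean works the same way).
df["amount"] = df["amount"].fillna(df["amount"].median())

# Categorical column: fill with the mode (most frequent value).
df["region"] = df["region"].fillna(df["region"].mode()[0])

# Total number of missing cells left after filling.
missing_after = int(df.isna().sum().sum())
```

Checking df.isna().sum() before and after is a quick way to confirm that nothing was missed.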
Did you handle the missing values before running the regression?
Are you running it on your local machine?
If yes,
did you also handle the NaNs properly?
PS: I would suggest using Colab to work with large datasets like this one.
Also, I don't think Linear Regression is appropriate for a classification problem.
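To illustrate that point, here is a minimal sketch that swaps in a classifier, LogisticRegression, in place of LinearRegression. The data is synthetic and only stands in for the churn features, so the numbers mean nothing for the real dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn data: 200 rows, 3 features, binary label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
clf = LogisticRegression()
clf.fit(scaler.transform(X_train), y_train)

# For a classifier, score() returns mean accuracy (for a regressor it is R^2).
accuracy = clf.score(scaler.transform(X_test), y_test)
```

Other classifiers would work too; LogisticRegression is just the closest linear counterpart.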
I am experiencing the same error, even though I use data.replace('?', -99999, inplace=True) for my missing values.
That won't work because your missing values aren't entered as "?". Use data.fillna instead.
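A small sketch of the difference, assuming the missing entries are real NaNs. The frame and its columns are made up, and filling with the column mean is only one possible choice.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose missing entries are real NaNs, not "?" strings.
data = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})

# replace("?", ...) only touches cells equal to "?", so the NaNs survive.
after_replace = data.replace("?", -99999)
nans_after_replace = int(after_replace.isna().sum().sum())

# fillna targets NaN directly; here each column is filled with its own mean.
after_fillna = data.fillna(data.mean())
nans_after_fillna = int(after_fillna.isna().sum().sum())
```

Here nans_after_replace stays at 2 while nans_after_fillna drops to 0, which is why the replace call alone leaves the ValueError in place.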
Thanks, that solved it.
You are welcome.👍
Thanks, everyone, for your clarifications; they were really helpful.😃