
DSN Pre-Bootcamp Hackathon: Expresso Churn Prediction Challenge by Data Science Nigeria

Helping Nigeria
Knowledge
Completed (over 5 years ago)
Classification
Prediction
671 joined
358 active
Start: Aug 08, 20
Close: Aug 22, 20
Reveal: Aug 22, 20
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
Data · 12 Aug 2020, 09:32 · 10

Please, friends, I get this error after running a classification on train.csv. How can I fix it? Thanks.

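import numpy as np
from sklearn import preprocessing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# df_train is the DataFrame loaded from train.csv
x = np.array(df_train.drop(['user_id', 'CHURN'], axis=1))
x = preprocessing.scale(x)
y = np.array(df_train['CHURN'])
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2)
clf = LinearRegression(n_jobs=-1)
clf.fit(x_train, y_train)
accuracy = clf.score(x_test, y_test)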

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

Discussion 10 answers
Lagos State University

This is a classification problem.

12 Aug 2020, 09:35
Upvotes 0
Federal University of Technology Akure

You have to handle the missing values (NaNs). You can fill them with the mean, median, or mode, or use an arbitrary value.
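A minimal sketch of that kind of filling with pandas. The toy DataFrame and its column names are made up for illustration and are not the real train.csv columns; which fill strategy suits each column is a judgment call.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for train.csv (column names are illustrative).
df = pd.DataFrame({
    "amount": [100.0, np.nan, 300.0, 500.0],
    "region": ["LAGOS", None, "LAGOS", "ABUJA"],
})

# Numeric column: fill missing entries with the column median.
df["amount"] = df["amount"].fillna(df["amount"].median())

# Categorical column: fill missing entries with the most frequent value (mode).
df["region"] = df["region"].fillna(df["region"].mode()[0])

# Count how many missing values are left in the whole frame.
remaining_nans = int(df.isna().sum().sum())
```

If `remaining_nans` is zero, the NaN part of the error should no longer be triggered by these columns.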

12 Aug 2020, 09:37
Upvotes 0

Did you handle the missing values before running the regression?
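One way to check is to count the NaN and infinite values before fitting, since the error message mentions both. This is only a sketch on a made-up DataFrame; the column names are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one NaN in "a", and one inf plus one NaN in "b".
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.inf, 2.0, np.nan],
    "c": [1.0, 2.0, 3.0],
})

# Missing values per column (infinity is not counted as missing by default).
nan_counts = df.isna().sum()

# Infinite values across the numeric columns.
numeric = df.select_dtypes(include="number")
inf_count = int(np.isinf(numeric.to_numpy()).sum())
```

Any column with a non-zero count needs cleaning before the data is passed to scikit-learn.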

12 Aug 2020, 09:37
Upvotes 0

Are you running it on your local machine?

If yes, did you also handle the NaNs properly?

PS: I would suggest using Colab to work with large datasets like this one.

12 Aug 2020, 09:59
Upvotes 0

...also, I don't think you can use linear regression for a classification problem.
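For comparison, here is a rough sketch that swaps in a classifier. LogisticRegression is just one possible choice and was not named in this thread; the data is synthetic, and median imputation is an assumption made for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data standing in for the churn features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Knock out roughly 10% of the entries to mimic missing values.
X[rng.random(X.shape) < 0.1] = np.nan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Impute, then scale, then fit a classifier; all steps are fitted on the training split only.
clf = make_pipeline(
    SimpleImputer(strategy="median"),
    StandardScaler(),
    LogisticRegression(max_iter=1000),
)
clf.fit(X_train, y_train)

preds = clf.predict(X_test)
accuracy = clf.score(X_test, y_test)  # mean accuracy for classifiers
```

With a classifier, `score` returns accuracy, whereas for LinearRegression it returns R², so the `accuracy` variable in the original post was not actually measuring accuracy.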

12 Aug 2020, 11:07
Upvotes 0

I am experiencing the same error, even though I use data.replace('?', -99999, inplace=True) for my missing values.

12 Aug 2020, 14:17
Upvotes 0

That won't work because your missing values aren't entered as "?". Use data.fillna instead.
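A small sketch of the difference, on a made-up one-column frame. The -99999 sentinel is simply the value from the post above, not a recommendation.

```python
import numpy as np
import pandas as pd

# Hypothetical data where the missing entry is a real NaN, not the string "?".
data = pd.DataFrame({"x": [1.0, np.nan, 3.0]})

# replace("?") looks for the literal string "?", finds none, and leaves the NaN in place.
replaced = data.replace("?", -99999)
nans_after_replace = int(replaced.isna().sum().sum())

# fillna targets NaN directly, so the missing entry gets the sentinel value.
filled = data.fillna(-99999)
nans_after_fillna = int(filled.isna().sum().sum())
```

Here `nans_after_replace` stays at 1 while `nans_after_fillna` drops to 0, which matches the behaviour described in this reply.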

Thanks. It solved it.

You are welcome.👍

Thanks, everyone, for your clarifications; they were really helpful.😃

13 Aug 2020, 12:40
Upvotes 0