Since the number of features is large (121 x 6 plus a few extra features), I thought I could reduce the dimensionality with PCA. But I see that PCA can't be run on data containing NaN values. Any suggestions on how I can go forward without having to impute the data?
I don't think you can perform standard PCA if you have NaNs, since it is basically a matrix factorization and needs every entry of the matrix to be defined. One way or another, you'll have to impute the missing values with some strategy.
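To make the impute-then-PCA idea concrete, here is a minimal sketch, assuming that simple column-mean imputation is acceptable for your data (it may not be). The function name `impute_then_pca`, the synthetic matrix, and the choice of 5 components are all made up for illustration. It uses plain NumPy; scikit-learn's `SimpleImputer` and `PCA` can be chained in a pipeline to do the same thing.

```python
import numpy as np

def impute_then_pca(X, n_components):
    """Fill NaNs with per-column means, then project onto the top principal axes."""
    X = np.asarray(X, dtype=float)
    # Per-feature mean computed over the observed (non-NaN) entries only.
    col_means = np.nanmean(X, axis=0)
    # Replace each NaN with the mean of its column (assumed strategy).
    X_filled = np.where(np.isnan(X), col_means, X)
    # PCA works on centered data.
    X_centered = X_filled - X_filled.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T

# Synthetic stand-in for the real feature matrix, with ~10% of entries missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
X[rng.random(X.shape) < 0.1] = np.nan

Z = impute_then_pca(X, n_components=5)
print(Z.shape)
```

Mean imputation is only the simplest option; median or model-based imputation would slot into the same place, and the fill values should be computed on the training data only.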
Yeah, makes sense, thanks pednt!