I don't understand: train accuracy is higher than 90, validation accuracy is close to 90, and the curve is converging nicely, BUT test accuracy is lower than 80! Why is that? Does the test data differ that much from train? How can I fight that? Thanks for any ideas.
Yes, I understand that. But usually we evaluate a model on a validation set taken from train, assuming the distribution of features is the same in train and test. Such a big difference shows that the structure of the test creatures actually differs from train. Humza showed, for example, that the test set doesn't include some amino acids (X, U) that are present in train. But I believe it is not the only reason.
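A quick way to check the amino acid point yourself is to compare the set of residue letters in each split. This is only a minimal sketch: the function name `residue_alphabet` and the toy sequences are made up for illustration and are not the competition data.

```python
def residue_alphabet(sequences):
    """Return the set of single-letter residue codes seen in the sequences."""
    return set().union(*(set(seq) for seq in sequences))


# Toy sequences, invented for illustration only (not the competition data).
train_seqs = ["MKVLAX", "GAUTRK"]
test_seqs = ["MKVLA", "GATRK"]

# Residues that occur in train but never in test.
only_in_train = residue_alphabet(train_seqs) - residue_alphabet(test_seqs)
print(sorted(only_in_train))
```

If this set is non-empty on the real data, the model sees tokens during training that it will never meet at test time, which is one (probably small) source of mismatch.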
train creature: array(['creature9', 'creature3', 'creature8', 'creature4', 'creature0', 'creature2', 'creature5', 'creature1'], dtype=object)
test creature: array(['creature7', 'creature6'], dtype=object)
I think this can explain the difference: the test set contains only creatures that never appear in train.
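If the creature split is the cause, one thing to try is a validation set that holds out whole creatures instead of random rows, so validation looks more like the test situation. Below is a rough sketch under that assumption. The helper `split_by_group` and the toy labels are hypothetical. scikit-learn's `GroupKFold` does a similar job if you want proper cross-validation.

```python
import random


def split_by_group(groups, n_val_groups=2, seed=0):
    """Hold out every sample belonging to n_val_groups randomly chosen groups."""
    unique = sorted(set(groups))
    rng = random.Random(seed)
    val_groups = set(rng.sample(unique, n_val_groups))
    train_idx = [i for i, g in enumerate(groups) if g not in val_groups]
    val_idx = [i for i, g in enumerate(groups) if g in val_groups]
    return train_idx, val_idx, val_groups


# Toy creature labels, one per sample (assumed layout, not the real data).
groups = [f"creature{i % 10}" for i in range(50)]

train_idx, val_idx, val_groups = split_by_group(groups, n_val_groups=2)
print("held-out creatures:", sorted(val_groups))
print("train samples:", len(train_idx), "validation samples:", len(val_idx))
```

With a split like this, no creature appears on both sides, so the validation score should be a more honest estimate of how the model does on unseen creatures.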