InstaDeep Enzyme Classification Challenge

Job Interview
Challenge completed almost 5 years ago
Classification
520 joined
70 active
Start: Nov 17, 20
Close: Feb 21, 21
Reveal: Feb 21, 21
Low test accuracy
Data · 15 Jan 2021, 19:57 · 2

I don't understand: train accuracy is higher than 90, validation accuracy is close to 90, and the curve is converging nicely, BUT test accuracy is lower than 80! Why is that? Does the test data differ that much from the train data? How can I fight that? Thanks for any ideas.

Discussion 2 answers
Kamenialexnea
Ecole nationale superieure polytechnique yaounde

train creatures: array(['creature9', 'creature3', 'creature8', 'creature4', 'creature0', 'creature2', 'creature5', 'creature1'], dtype=object)
test creatures: array(['creature7', 'creature6'], dtype=object)

I think this can explain the difference: the test creatures do not appear in train at all.
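If the test creatures never appear in train, one thing that might help is to validate the same way: hold out whole creatures instead of random rows, so the validation score is closer to what the test set measures. Below is only a rough sketch of that idea in plain Python. The function name `group_holdout_split` and the toy `creatures` list are made up for illustration, they are not part of the competition code, and I am assuming each row carries a creature label.

```python
import random


def group_holdout_split(groups, n_holdout=2, seed=0):
    """Split row indices so that whole groups are held out for validation.

    `groups` holds one group label per row (here: the creature of each row).
    Returns (train_idx, val_idx, held_out_groups).
    """
    unique_groups = sorted(set(groups))
    rng = random.Random(seed)
    # Pick entire groups to hold out, never individual rows.
    held_out = set(rng.sample(unique_groups, n_holdout))
    train_idx = [i for i, g in enumerate(groups) if g not in held_out]
    val_idx = [i for i, g in enumerate(groups) if g in held_out]
    return train_idx, val_idx, held_out


# Toy labels (assumption): 5 rows for each of the 8 train creatures.
train_creature_names = ["creature0", "creature1", "creature2", "creature3",
                        "creature4", "creature5", "creature8", "creature9"]
creatures = [name for name in train_creature_names for _ in range(5)]

train_idx, val_idx, held_out = group_holdout_split(creatures, n_holdout=2)

# No creature should appear on both sides of the split.
train_side = {creatures[i] for i in train_idx}
val_side = {creatures[i] for i in val_idx}
print("held out:", sorted(held_out))
print("overlap:", train_side & val_side)
```

If validation accuracy on the held-out creatures drops toward the test score, that would support the explanation above. It would not prove that the creature split is the only cause.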

13 Feb 2021, 02:59

Yes, I understand that. But usually we evaluate our model on a validation set taken from train, assuming the distribution of features is the same in train and test. Such a big difference shows that the structure of the test creatures actually differs from train. Humza showed that, for example, the test set doesn't include some amino acids (X, U) that are present in train. But I believe it is not the only reason.
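The amino acid difference mentioned above is easy to check directly by comparing the set of residue letters in each split. This is a small sketch, assuming the sequences are plain strings of one-letter codes. The `train_seqs` and `test_seqs` lists are toy data invented for the example (not the real competition sequences), and `residue_vocab` is a hypothetical helper name.

```python
def residue_vocab(sequences):
    """Return the set of distinct one-letter residue codes in the sequences."""
    vocab = set()
    for seq in sequences:
        vocab.update(seq)
    return vocab


# Toy sequences (assumption), chosen so that X and U occur only in train.
train_seqs = ["MKXLA", "GUAVL"]
test_seqs = ["MKLA", "GAVL"]

only_in_train = residue_vocab(train_seqs) - residue_vocab(test_seqs)
only_in_test = residue_vocab(test_seqs) - residue_vocab(train_seqs)

print("only in train:", sorted(only_in_train))
print("only in test:", sorted(only_in_test))
```

The same comparison could be extended to sequence lengths or per-residue frequencies to look for other differences between train and test.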