InstaDeep Enzyme Classification Challenge
Job Interview
Can you predict the class of an enzyme using only its amino acid sequence?
346 data scientists enrolled, 70 on the leaderboard
17 November 2020—21 February 2021
97 days
13th place summary
published 23 Feb 2021, 03:12

The approach was simple. Tokenized using keras tokenizer at a character level. Used an Embedding layer with each amino acid being represented in 100 dimensions. This was followed by a 1D Convolution with 4096 filters and then followed by concatenation of global max pool and global avg pool. This was fed to a Dense layer for final classification. Had to use TPU since dataset size was huge. Will share the code soon.