The challenge dataset comes from a high-density peptide microarray experiment designed to measure how cross-reactive 8 commercially available snake antivenoms are, and to locate where in the toxin sequence the antibodies they contain bind (the epitope).
Each row in the dataset represents a k-mer (a 16-amino-acid sequence within the toxin) and has a signal column measured in the high-density peptide microarray experiment.
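To make the k-mer idea concrete, here is a minimal sketch of how fixed-length windows can be cut from a protein sequence. The function name `sliding_kmers`, the step size of 1, and the toy sequence are assumptions for illustration only; the source does not state how the array peptides were tiled along each toxin.

```python
def sliding_kmers(sequence, k=16, step=1):
    """Return every length-k window of `sequence`, advancing `step` residues at a time."""
    if k <= 0 or step <= 0:
        raise ValueError("k and step must be positive")
    # range() is empty when the sequence is shorter than k, so the result is [].
    return [sequence[i:i + k] for i in range(0, len(sequence) - k + 1, step)]


# Hypothetical 20-residue sequence, not taken from the dataset.
toy_toxin = "MKTLLLTLVVVTIVCLDLGY"
windows = sliding_kmers(toy_toxin)

for start, kmer in enumerate(windows):
    print(start, kmer)
```

Overlapping windows like these mean that neighbouring rows from the same toxin share most of their residues, which is worth keeping in mind when splitting data for validation.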
Your task is to predict the signal for a given Toxin_K_mer and Antivenom pair. You can use any other column available in the test set to enhance your predictions or enrich your data. We also provide prot_bert protein embeddings for each row.
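As one possible starting point, the sketch below turns a (Toxin_K_mer, Antivenom) pair into a numeric feature vector: the amino-acid composition of the k-mer followed by a one-hot encoding of the antivenom. This is not the starter notebook's method. The helper names, the toy rows, and the antivenom labels `AV_1` and `AV_2` are hypothetical; only the column names `Toxin_K_mer` and `Antivenom` come from the description above.

```python
# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(kmer):
    """Fraction of each standard amino acid in the k-mer (20 values)."""
    if not kmer:
        raise ValueError("k-mer must not be empty")
    return [kmer.count(aa) / len(kmer) for aa in AMINO_ACIDS]


def one_hot(value, categories):
    """1.0 at the position of `value` in `categories`, 0.0 elsewhere."""
    return [1.0 if value == c else 0.0 for c in categories]


def featurize(rows, antivenoms):
    """Concatenate k-mer composition and antivenom one-hot for each row."""
    return [
        composition(row["Toxin_K_mer"]) + one_hot(row["Antivenom"], antivenoms)
        for row in rows
    ]


# Toy rows for illustration only; real rows would be read from the challenge files.
toy_rows = [
    {"Toxin_K_mer": "MKTLLLTLVVVTIVCL", "Antivenom": "AV_1"},
    {"Toxin_K_mer": "KTLLLTLVVVTIVCLD", "Antivenom": "AV_2"},
]
antivenoms = sorted({row["Antivenom"] for row in toy_rows})
features = featurize(toy_rows, antivenoms)
```

The resulting vectors can be passed to any regression model. Composition features discard residue order, so the provided prot_bert embeddings are likely a richer input where order matters.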
Watch this video for a walk-through of the starter notebook and an explanation of the challenge's relevance.
You can also view it here.
The data you are presented with includes the following columns: