The antibiotic resistance prediction dataset comprises about 17k protein sequences extracted from HMD-ARG-DB. Precomputed embeddings are available; they were calculated with esm1_t34_670M_UR100 using CLS pooling.
The antibiotic-resistance data were collected and cleaned from seven published ARG (antibiotic resistance gene) databases: Comprehensive Antibiotic Resistance Database (CARD), AMRFinder, ResFinder, Antibiotic Resistance Gene-ANNOTation (ARG-ANNOT), DeepARG, MEGARes, and Resfams. HMD-ARG-DB comprises 17,282 high-quality sequences with labels covering 33 antibiotic classes, 7 underlying resistance mechanisms, and the respective gene names.
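
As a rough illustration of what CLS pooling means for the embeddings mentioned above, the sketch below reduces a matrix of per-token vectors to one vector per sequence by keeping only the first position. It assumes the model prepends a special CLS/BOS token at index 0. The function names `cls_pool` and `mean_pool` and the toy values are hypothetical and are not part of the dataset or of any library. Mean pooling is included only for contrast. This is not necessarily how the released embeddings were produced.

```python
from typing import List, Sequence


def cls_pool(token_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Return the vector at position 0, assumed to be the CLS/BOS token."""
    if len(token_embeddings) == 0:
        raise ValueError("expected at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average the residue vectors, skipping the assumed CLS/BOS token at index 0."""
    residues = token_embeddings[1:]
    if len(residues) == 0:
        raise ValueError("expected at least one residue embedding")
    dim = len(residues[0])
    return [sum(vec[i] for vec in residues) / len(residues) for i in range(dim)]


# Toy input: one CLS vector followed by three residue vectors, dimension 2.
# Real per-token vectors from a protein language model are much wider.
toy = [
    [0.5, -1.0],  # assumed CLS/BOS position
    [1.0, 2.0],
    [3.0, 4.0],
    [5.0, 6.0],
]

cls_vector = cls_pool(toy)
mean_vector = mean_pool(toy)
```

With CLS pooling, every sequence maps to a fixed-length vector regardless of its length, which is what makes the precomputed embeddings directly usable as classifier inputs.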