MPEG-G Microbiome Classification Challenge

$5 000 USD
Challenge completed ~1 month ago
Classification
Federated Learning
Python
Deep Learning
769 joined
83 active
Start: Jun 20, 25
Close: Sep 15, 25
Reveal: Sep 15, 25
About

Each row in the dataset represents a single sample collected from one of four body sites: "Nasal", "Mouth", "Stool", or "Skin".

The MPEG-G files contain compressed FASTQ files of 16S rRNA sequence profile features.

The data is split into 2 901 train and 1 068 test samples.

The data is provided in this format. You will need to create four folders, one per body site, for your federated learning solution.
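As a rough illustration of the folder layout described above, the sketch below groups sample rows by body site and writes one folder per site. It is a minimal sketch under stated assumptions, not the organisers' method: the column name `SampleType`, the output file name `samples.csv`, and the function name are hypothetical, so check the actual column names in the provided CSV files before using it.

```python
import csv
from collections import defaultdict
from pathlib import Path

# The four body sites named in the challenge description.
BODY_SITES = ("Nasal", "Mouth", "Stool", "Skin")


def split_by_body_site(rows, out_dir, site_column="SampleType"):
    """Write one folder per body site, each holding a CSV of that site's rows.

    `rows` is an iterable of dicts (for example from csv.DictReader).
    `site_column` is a hypothetical column name; adjust it to the real data.
    Returns a dict mapping each body site to its number of rows.
    """
    out_dir = Path(out_dir)
    by_site = defaultdict(list)
    for row in rows:
        site = row[site_column]
        if site not in BODY_SITES:
            raise ValueError(f"unexpected body site: {site!r}")
        by_site[site].append(row)

    counts = {}
    for site in BODY_SITES:
        site_dir = out_dir / site
        # Create the folder even when a site has no rows, so every
        # federated client directory exists.
        site_dir.mkdir(parents=True, exist_ok=True)
        site_rows = by_site.get(site, [])
        counts[site] = len(site_rows)
        if site_rows:
            with open(site_dir / "samples.csv", "w", newline="") as fh:
                writer = csv.DictWriter(fh, fieldnames=list(site_rows[0].keys()))
                writer.writeheader()
                writer.writerows(site_rows)
    return counts
```

Rows could be loaded with `csv.DictReader` from the training CSV and passed straight to `split_by_body_site`; each resulting folder can then act as one client in a federated setup.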

The data is split with stratification by participant and insulin sensitivity.
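If you want a local validation set that follows the same idea, one option is to hold out whole participants while sampling within each insulin-sensitivity group. The sketch below is only an illustration under assumptions: the keys `participant` and `insulin_sensitivity` are hypothetical names, and the organisers' actual splitting procedure is not described here.

```python
import random
from collections import defaultdict


def grouped_stratified_split(rows, group_key="participant",
                             strata_key="insulin_sensitivity",
                             val_fraction=0.2, seed=0):
    """Hold out whole participants, sampling within each stratum.

    `group_key` and `strata_key` are hypothetical column names.
    Returns (train_rows, val_rows); no participant appears in both.
    """
    # Assign each participant one stratum label (the first one seen).
    stratum_of = {}
    for row in rows:
        stratum_of.setdefault(row[group_key], row[strata_key])

    by_stratum = defaultdict(list)
    for participant, stratum in stratum_of.items():
        by_stratum[stratum].append(participant)

    rng = random.Random(seed)
    val_participants = set()
    for stratum in sorted(by_stratum, key=str):
        participants = sorted(by_stratum[stratum], key=str)
        rng.shuffle(participants)
        # Keep at least one participant for validation when a stratum
        # has more than one; a single-participant stratum stays in train.
        if len(participants) > 1:
            n_val = max(1, round(len(participants) * val_fraction))
        else:
            n_val = 0
        val_participants.update(participants[:n_val])

    train = [r for r in rows if r[group_key] not in val_participants]
    val = [r for r in rows if r[group_key] in val_participants]
    return train, val
```

Splitting by participant rather than by sample avoids the same person contributing samples to both the training and validation sets, which would otherwise inflate validation scores.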

Please note that not all samples have cytokine information provided.

Files
Description
These are the test MPEG-G Files.
These are the train MPEG-G Files.
An example of what your submission file should look like. The order of the rows does not matter, but the values in the "ID" column must be correct.
Test resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
Train contains the target. This is the dataset that you will use to train your model.
Information about the subjects in train.
This is a starter notebook to help you visualise and extract the MPEG-G files.
This starter notebook helps you build a model and make your first submission to the leaderboard.
Variable definitions for the Train_Subjects file.
Deep Learning Glossary for Genomics Experts
How to set up your environment on Mac.
MPEG-G Glossary
How to set up your environment on Linux.
Microbiome Basics for AI and Non-Bio Experts.
Microbiome AI MPEG-G Challenge Introduction Slides.
Introduction to MPEG-G.
Deep Learning Introduction session.
Cytokine profiles. Please note that not all samples have cytokine information provided.