MPEG-G Microbiome Classification Challenge

$5 000 USD

Completed (10 months ago)

Skills you will learn

Classification

Federated Learning

Python

Deep Learning

797 joined

83 active

Start: Jun 20, 2025

Close: Sep 15, 2025

Reveal: Sep 15, 2025

About

Each row in the dataset represents a single sample collected from one of four body sites: "Nasal", "Mouth", "Stool", or "Skin".

The MPEG-G files contain compressed FASTQ files of 16S rRNA sequence profile features.

The data is split into 2 901 train and 1 068 test samples.

You are provided with data in this format. You will need to create four folders, one per body site, for your federated learning solution.
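The folder-per-site step above could be sketched roughly as follows. This is a minimal illustration, not the organisers' method: the column name `SampleType`, the output file name `samples.csv`, and the toy IDs are all assumptions, so check the real metadata before reusing it.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path

# The four body sites named in the dataset description.
BODY_SITES = ("Nasal", "Mouth", "Stool", "Skin")


def split_by_body_site(rows, out_dir, site_column="SampleType"):
    """Create one folder per body site and write that site's rows to samples.csv.

    `site_column` is a hypothetical column name; the real metadata may differ.
    """
    out_dir = Path(out_dir)
    grouped = defaultdict(list)
    for row in rows:
        site = row[site_column]
        if site not in BODY_SITES:
            raise ValueError(f"unexpected body site: {site!r}")
        grouped[site].append(row)

    paths = {}
    for site in BODY_SITES:
        site_dir = out_dir / site
        site_dir.mkdir(parents=True, exist_ok=True)
        path = site_dir / "samples.csv"
        with path.open("w", newline="") as fh:
            site_rows = grouped[site]
            if site_rows:  # a site with no rows gets an empty file
                writer = csv.DictWriter(fh, fieldnames=list(site_rows[0].keys()))
                writer.writeheader()
                writer.writerows(site_rows)
        paths[site] = path
    return paths


# Toy rows standing in for the real metadata (IDs are made up).
toy_rows = [
    {"ID": "ID_A", "SampleType": "Nasal"},
    {"ID": "ID_B", "SampleType": "Mouth"},
    {"ID": "ID_C", "SampleType": "Stool"},
    {"ID": "ID_D", "SampleType": "Stool"},
]
demo_dir = Path(tempfile.mkdtemp())
site_paths = split_by_body_site(toy_rows, demo_dir)
```

Each folder can then act as one client in a federated setup, with the client reading only its own `samples.csv`.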

The data is a stratified split by participant and insulin sensitivity.
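For local validation, a hold-out set could follow the same idea: keep all samples from one participant on the same side of the split, and draw participants from each insulin-sensitivity group. The sketch below is an assumption about how such a split might be built, not a description of how the organisers made theirs. The labels "IR" and "IS" and all IDs are hypothetical.

```python
import random
from collections import defaultdict


def grouped_stratified_split(samples, holdout_fraction=0.25, seed=0):
    """Split (sample_id, participant_id, label) tuples by participant.

    Participants are drawn per label so each label appears in the hold-out set,
    and no participant contributes samples to both sides.
    """
    samples = list(samples)

    # Use the first label seen for each participant as that participant's label.
    participant_label = {}
    for _, participant, label in samples:
        participant_label.setdefault(participant, label)

    by_label = defaultdict(list)
    for participant, label in participant_label.items():
        by_label[label].append(participant)

    rng = random.Random(seed)
    holdout = set()
    for label in sorted(by_label):
        participants = sorted(by_label[label])
        rng.shuffle(participants)
        # At least one participant per label goes to the hold-out set.
        n_holdout = max(1, round(len(participants) * holdout_fraction))
        holdout.update(participants[:n_holdout])

    train = [s for s in samples if s[1] not in holdout]
    valid = [s for s in samples if s[1] in holdout]
    return train, valid


# Toy data: 8 made-up participants, 2 samples each, 4 per hypothetical label.
toy_samples = [
    (f"S{p}_{k}", f"P{p}", "IR" if p < 4 else "IS")
    for p in range(8)
    for k in range(2)
]
train_split, valid_split = grouped_stratified_split(toy_samples)
```

With these toy inputs, one participant per label is held out, so the hold-out set contains 4 of the 16 samples.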

Please note that not all samples have cytokine information provided.

Files


These are the test MPEG-G Files.

These are the train MPEG-G Files.

An example of what your submission file should look like. The order of the rows does not matter, but the "ID" values must be correct.

Test resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.

Train contains the target. This is the dataset that you will use to train your model.

Information about the subjects in train.

This is a starter notebook to help you visualise and extract the MPEG-G files.

This starter notebook helps you build a model and make your first submission to the leaderboard.

Variable definitions for the Train_Subjects file.

Deep Learning Glossary for Genomics Experts

How to set up your environment on Mac.

MPEG-G Glossary

How to set up your environment on Linux.

Microbiome Basics for AI and Non-Bio Experts.

Microbiome AI MPEG-G Challenge Introduction Slides.

Introduction to MPEG-G.

Deep Learning Introduction session.

Cytokine profiles. Please note that not all samples have cytokine information provided.
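Because the submission file is matched on "ID" rather than on row order, a quick check of the IDs before uploading may save a failed submission. The helper below is a minimal sketch; the function name and the toy IDs are hypothetical, and it assumes you have already read the ID columns of your submission and of the test file into lists.

```python
from collections import Counter


def check_submission_ids(submission_ids, test_ids):
    """Compare submission IDs with test IDs, ignoring row order.

    Returns sorted lists of missing, unexpected, and duplicated IDs.
    """
    counts = Counter(submission_ids)
    submitted = set(counts)
    expected = set(test_ids)
    return {
        "missing": sorted(expected - submitted),
        "unexpected": sorted(submitted - expected),
        "duplicated": sorted(i for i, n in counts.items() if n > 1),
    }


# Toy IDs for illustration only.
report = check_submission_ids(
    ["ID_2", "ID_1", "ID_1", "ID_9"],
    ["ID_1", "ID_2", "ID_3"],
)
```

A submission is consistent with the test set when all three lists in the report are empty.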
