MPEG-G Microbiome Classification Challenge

$5 000 USD

Completed (10 months ago)

Skills you will learn

Classification

Federated Learning

Python

Deep Learning

797 joined

83 active

Start

Jun 20, 25

Sep 15, 25

Reveal

Sep 15, 25

Souptik_Mallick

Neural Networks vs Boosting Methods

Platform · 9 Sep 2025, 08:09 · 15

Everyone with a high public score, including us, is using one of the boosting classifiers, not a neural network.

How are we supposed to build a federated learning model with that? Without boosting, no one will make the top 15: a neural network cannot match the accuracy of these state-of-the-art models.

Any solutions?

Discussion 15 answers

Souptik_Mallick

Our team is also getting a very strong score of 3.16e-7, but it is being treated as invalid even though nothing was hardcoded or post-processed. According to Zindi, a 'valid' score seems possible only if everyone trains an NN model from scratch. There is a high chance of rejection at code review, since the rules say to use a PyTorch model.

9 Sep 2025, 08:13

Upvotes 0

CodeJoe

Yes, from what I'm seeing, you definitely need to use a neural net.

replied to Souptik_Mallick · 9 Sep 2025, 09:00

Upvotes 0

51pegasi

Where is that indicated? I suspect most (if not all) of the top of the LB is using boosting models.

replied to CodeJoe · 9 Sep 2025, 09:04

Upvotes 0

Souptik_Mallick

Yes, I'm also concerned: to be at the top of the LB we have to use boosting models, and that is not mentioned in the challenge info.

replied to 51pegasi · 9 Sep 2025, 09:10

Upvotes 0

CodeJoe

Your solution must include:

A centralised model, trained using the combined dataset.

A federated learning approach, trained using the split datasets.

The aim is to prove that MPEG-G files are more efficient due to less data movement and better compression, and that federated learning preserves performance across sites better than a centralised model. You must:

Define Clients: Treat each folder (or grouping by participants/site) as a "location".

Create a Federated Dataset Loader: Each client should have its own dataset split (e.g., location1_data, location2_data, ...) and convert each into a PyTorch Dataset.

Define the Same PyTorch Model: Same architecture as the centralised model.

Implement FL with Flower or PySyft

Train the FL Model

Make a submission to Zindi
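The client, shared-model, and training steps above can be sketched without any framework. The example below is a minimal illustration of federated averaging (FedAvg), the aggregation rule commonly used for neural-network FL. It is not the challenge's reference solution and does not use Flower or PySyft: the logistic-regression "model", the synthetic `make_client_data` sites, and all hyperparameters are assumptions chosen so the example runs on the standard library alone. In a real submission the weight lists would be replaced by the parameters of the shared PyTorch model.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def local_train(weights, data, lr=0.5, epochs=5):
    """One client's update: a few epochs of full-batch gradient descent on log loss."""
    w = list(weights)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / len(data) for wi, g in zip(w, grad)]
    return w


def fed_avg(client_weights, client_sizes):
    """Server step: average client weights, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def make_client_data(n, shift, rng):
    """Hypothetical synthetic 'site': label is 1 when x1 + x2 > 0; last feature is a bias term."""
    data = []
    for _ in range(n):
        x1 = rng.gauss(shift, 1.0)
        x2 = rng.gauss(-shift, 1.0)
        data.append(([x1, x2, 1.0], 1.0 if x1 + x2 > 0 else 0.0))
    return data


def accuracy(w, data):
    hits = sum((predict(w, x) > 0.5) == (y == 1.0) for x, y in data)
    return hits / len(data)


rng = random.Random(0)
# Three "locations" with different sizes and feature distributions.
clients = [make_client_data(n, s, rng) for n, s in [(80, -0.5), (120, 0.0), (60, 0.5)]]

global_w = [0.0, 0.0, 0.0]
for round_idx in range(20):
    # Every client starts from the same global weights and trains only on its own data.
    updates = [local_train(global_w, data) for data in clients]
    global_w = fed_avg(updates, [len(d) for d in clients])

all_data = [row for d in clients for row in d]
fed_acc = accuracy(global_w, all_data)
print(f"federated accuracy on pooled data: {fed_acc:.3f}")
```

Only the weight vectors travel between clients and server; the raw rows never leave `clients[i]`, which is the property the federated requirement is meant to demonstrate.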

In your final solution submission, you will need to output three submission files: one from your centralised model, one from your federated learning model, and a technical explanation of the innovation and approach used. You will also need to indicate training time, inference time, resources used and any observations on .mgb vs uncompressed formats.
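For the training-time and inference-time figures, one simple option is to wrap each call in a small timing helper. The sketch below is only an illustration: `measure` and `dummy_train` are hypothetical names, and `tracemalloc` reports memory allocated through Python's allocator only, so GPU memory and most native library buffers would need a separate tool.

```python
import time
import tracemalloc


def measure(fn, *args, **kwargs):
    """Run fn once; return (result, elapsed seconds, peak traced memory in MiB)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak / 2**20


def dummy_train(n):
    # Hypothetical placeholder for a real training or inference call.
    return sum(i * i for i in range(n))


result, seconds, peak_mib = measure(dummy_train, 100_000)
print(f"train: {seconds:.4f} s, peak traced memory: {peak_mib:.2f} MiB")
```

Running the same helper around the centralised run, the federated run, and each inference pass gives numbers that are directly comparable in the write-up.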

replied to 51pegasi · 9 Sep 2025, 09:12

Upvotes 0

51pegasi

But we can also apply federated learning to XGBoost, for example (although I'm not sure about the performance). So why the restriction?
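One common way to federate tree boosting is histogram aggregation: each client bins its own rows, sends only per-bin sums of gradients and Hessians, and the server adds the histograms and picks the split. The toy sketch below shows that idea for a single split on one feature. It is not XGBoost's or Flower's API; the bin edges, the data, and the function names are made up for illustration, and the gain is the usual second-order split score with an L2 term `lam`.

```python
import bisect


def client_histogram(xs, ys, preds, edges):
    """Per-bin gradient and Hessian sums for log loss; only these leave the client."""
    g_hist = [0.0] * (len(edges) + 1)
    h_hist = [0.0] * (len(edges) + 1)
    for x, y, p in zip(xs, ys, preds):
        b = bisect.bisect_right(edges, x)  # bin k holds edges[k-1] <= x < edges[k]
        g_hist[b] += p - y
        h_hist[b] += p * (1.0 - p)
    return g_hist, h_hist


def merge(histograms):
    """Server: element-wise sum of the clients' histograms."""
    g_total = [sum(col) for col in zip(*(g for g, _ in histograms))]
    h_total = [sum(col) for col in zip(*(h for _, h in histograms))]
    return g_total, h_total


def best_split(g_hist, h_hist, edges, lam=1.0):
    """Scan candidate thresholds and return (edge, gain) with the highest gain."""
    G, H = sum(g_hist), sum(h_hist)
    best_gain, best_edge = 0.0, None
    gl = hl = 0.0
    for k, edge in enumerate(edges):
        gl += g_hist[k]  # left side: all rows with x < edge
        hl += h_hist[k]
        gr, hr = G - gl, H - hl
        gain = gl * gl / (hl + lam) + gr * gr / (hr + lam) - G * G / (H + lam)
        if gain > best_gain:
            best_gain, best_edge = gain, edge
    return best_edge, best_gain


edges = [1.0, 2.0, 3.0]  # bin edges agreed on by all clients (assumed)
site_a = ([0.5, 1.5, 2.5, 3.5], [0, 0, 1, 1])
site_b = ([0.2, 1.8, 2.2, 3.9], [0, 0, 1, 1])

# First boosting round: every row starts at predicted probability 0.5.
hists = [client_histogram(xs, ys, [0.5] * len(xs), edges) for xs, ys in (site_a, site_b)]
g_hist, h_hist = merge(hists)
edge, gain = best_split(g_hist, h_hist, edges)
print(f"split at x < {edge} with gain {gain:.2f}")
```

Because the merged histogram equals the histogram of the pooled data, the chosen split is the same one a centralised learner would pick with the same bins, which is why this style of federated boosting can track centralised accuracy closely.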

replied to CodeJoe · 9 Sep 2025, 09:16

Upvotes 0

CodeJoe

Can't tell, let's confirm with the organizers 😅.

replied to 51pegasi · 9 Sep 2025, 09:18

Upvotes 1

Koleshjr

Multimedia University of Kenya

If you are getting this score (3.16e-7) without post-processing, you should definitely submit it, as it is valid.

replied to Souptik_Mallick · 9 Sep 2025, 10:00

Upvotes 0

Souptik_Mallick

But the thing is, it's a decision-tree-based model, and federated learning with that is complex.

Also, the competition does not want us to use any other model; it says to use PyTorch. That's why there is so much confusion. :)

replied to Koleshjr · 9 Sep 2025, 10:03

Upvotes 0

Koleshjr

Multimedia University of Kenya

Federated learning is a must! But I have seen a federated learning implementation for XGBoost; you can try that. Also, what is your local CV for that score?

replied to Souptik_Mallick · 9 Sep 2025, 10:11

Upvotes 0

CodeJoe

You can retrain a boosting model.

9 Sep 2025, 08:52

Upvotes 0

Souptik_Mallick

The boosting models are ready, as is the NN, but the performance gap between the two is quite large. No one is sure yet how the final evaluation will be done during code review.

Will code that uses boosting models be accepted as valid?

replied to CodeJoe · 9 Sep 2025, 09:12

Upvotes 1