Your Voice, Your Device, Your Language Challenge by ITU

Helping Africa

1 000 CHF

Completed (10 months ago)

ML_Wizzard

Nasarawa State University

Starter Notebook: Achieving 0.48 Score with SeamlessM4T for Audio Transcription

4 Aug 2025, 11:23

This is a starter notebook for audio transcription with the SeamlessM4T model, tailored to the Sartify ITU Zindi test dataset. It shows how to load, play, and process audio files, run ASR in Swahili ('swh'), and generate a submission file that scores around 0.48 on the Zindi leaderboard.

The notebook includes:

Installation of required libraries (fairseq2, pydub, sentencepiece, and seamless_communication).
Loading and preprocessing audio data from the dataset.
Using the SeamlessM4T medium model with the vocoder_36langs vocoder for transcription.
Batch processing of audio files to optimize performance.
Post-processing of predictions to create a submission-ready CSV file.
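
To give a rough idea of the last two steps (batching and building the CSV), here is a minimal sketch using only the Python standard library. It is not the notebook's actual code. The names `batched`, `clean_text`, and `write_submission` are hypothetical, the `transcribe_batch` callable stands in for the model call, and the column names `filename` and `text` are assumptions, so check the sample submission file for the real header.

```python
import csv
from pathlib import Path
from typing import Callable, Iterator, List, Sequence, Tuple


def batched(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])


def clean_text(text: str) -> str:
    """Collapse repeated whitespace and trim the ends (assumed post-processing)."""
    return " ".join(text.split())


def write_submission(
    audio_paths: Sequence[str],
    transcribe_batch: Callable[[List[str]], List[str]],
    out_csv: str,
    batch_size: int = 8,
    header: Tuple[str, str] = ("filename", "text"),  # assumed column names
) -> int:
    """Transcribe files batch by batch and write one CSV row per file.

    `transcribe_batch` is a placeholder: it takes a list of audio paths and
    must return one transcript per path, in the same order.
    """
    rows = []
    for batch in batched(audio_paths, batch_size):
        texts = transcribe_batch(batch)
        if len(texts) != len(batch):
            raise ValueError("expected one transcript per audio file")
        for path, text in zip(batch, texts):
            # Use the file name without its extension as the row ID (assumption).
            rows.append((Path(path).stem, clean_text(text)))

    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

In the notebook itself, `transcribe_batch` would wrap whatever call is made to the SeamlessM4T model with the target language set to 'swh'. Keeping it as a plain callable means the batching and CSV logic can be tried out with a dummy function before loading the model.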

https://github.com/mubrij/Your-Voice-Your-Device-Your-Language-Challenge
