
Your Voice, Your Device, Your Language Challenge by ITU


Koleshjr

Multimedia University of Kenya

Zindi ML Live – Kiswahili ASR (Part 6): Data Prep for Wav2Vec

Platform · 12 Aug 2025, 20:08

Today’s focus was data preparation for Wav2Vec fine-tuning: making sure our audio and transcripts are clean, properly formatted, and ready for the model.

We went through:

Discussing “what’s next” for CTC fine-tuning, possibly training from scratch
Downloading datasets via Hugging Face
Preprocessing audio and transcripts
Structuring data for Wav2Vec (input_values & labels)
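
For readers who missed the stream, here is a minimal sketch of what the last two steps can look like. It is not the code from the session. The cleaning rule (keep only a-z, apostrophes, and spaces), the special tokens `[PAD]`, `[UNK]`, and `|`, and all function names are assumptions made for illustration. The audio is assumed to be a mono waveform already resampled to the rate the model expects (16 kHz for the public Wav2Vec 2.0 checkpoints). The output uses the `input_values` key that the Hugging Face Wav2Vec2 processor produces for audio, alongside `labels` for the CTC targets.

```python
import re
import unicodedata

# Assumed cleaning rule: anything other than a-z, apostrophe, or space is dropped.
# Digits are dropped too; spell them out beforehand if your transcripts contain them.
_DROP = re.compile(r"[^a-z' ]+")


def normalize_transcript(text):
    """Lowercase, unify apostrophes, strip punctuation, collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = text.replace("\u2019", "'")  # curly apostrophe, as in "ng\u2019ombe"
    text = _DROP.sub(" ", text)
    return " ".join(text.split())


def build_vocab(transcripts):
    """Character-level CTC vocabulary; '|' stands in for the space between words."""
    chars = sorted(set("".join(transcripts)) - {" "})
    vocab = {"[PAD]": 0, "[UNK]": 1, "|": 2}  # hypothetical special tokens
    for ch in chars:
        vocab[ch] = len(vocab)
    return vocab


def encode_labels(text, vocab):
    """Map a normalized transcript to a list of token ids."""
    unk = vocab["[UNK]"]
    return [vocab.get("|" if ch == " " else ch, unk) for ch in text]


def normalize_audio(samples):
    """Scale a waveform (list of floats) to zero mean and unit variance."""
    if not samples:
        return []
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    scale = (var + 1e-7) ** 0.5  # small epsilon avoids dividing by zero on silence
    return [(s - mean) / scale for s in samples]


def prepare_example(samples, transcript, vocab):
    """Bundle one training example in the shape a CTC trainer usually expects."""
    return {
        "input_values": normalize_audio(samples),
        "labels": encode_labels(normalize_transcript(transcript), vocab),
    }


if __name__ == "__main__":
    cleaned = [normalize_transcript(t) for t in ["Habari, Dunia!", "Ng\u2019ombe"]]
    vocab = build_vocab(cleaned)
    print(prepare_example([0.5, -0.5, 0.25], "Habari, Dunia!", vocab))
```

In a real pipeline the vocabulary would be built from the full training set and handed to a tokenizer, and the library's feature extractor would handle the audio scaling; the sketch only shows the shape of the data.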

Data prep is the foundation, and today we poured a solid one.

Replay: https://youtu.be/CXYp2YoTMO4?si=xxJSNehXPy0uzqV8

Live Schedule: https://www.twitch.tv/koleshjr/schedule

📢 Subscribe: https://www.youtube.com/@koleshjr
