Zindi ML Live: Kiswahili ASR – Exploring Speech-to-Text, Baselines & Model Errors (Part 1)
In this stream, we kicked off a brand new challenge: building a Kiswahili speech-to-text (ASR) system for real-world, low-resource environments.
🧠 Here's what we did:
- Broke down what ASR is for beginners
- Explored the Zindi competition and the real-world impact of offline voice tech
- Researched past ASR competitions and stalked a winning solution (spoiler: it was mine 😅)
- Created a training-free baseline by hunting for pre-trained Swahili models on HuggingFace
- Let AI write the code for us to simulate a beginner workflow
- Faced a hilarious number of bugs (oops) but got it working in the end
- Landed a Top 5 leaderboard position with zero training 🎉
FYI: the 3rd-placed submission isn't shown on the stream, but it's a hunted model as well (so keep hunting haha).
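For anyone who wants to try the training-free idea, here is a rough sketch of what such a baseline could look like. It is not the exact code from the stream. It assumes the `transformers` library (plus `ffmpeg` for reading audio files) is installed, and the model id and file names in the usage comment are hypothetical placeholders to swap for whatever Swahili checkpoint you hunt down. The `word_error_rate` helper is just a common way to compare hunted models locally; check the competition page for the metric it actually scores on.

```python
from typing import List


def transcribe(audio_paths: List[str], model_id: str) -> List[str]:
    """Run a pre-trained ASR checkpoint over audio files, with no training."""
    # Imported lazily so the scoring helper below works without transformers.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    # Given a file path, the pipeline returns a dict with a "text" field.
    return [asr(path)["text"] for path in audio_paths]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical usage (placeholder model id and file name):
# texts = transcribe(["clip_001.wav"], model_id="some-user/swahili-asr-model")
# print(word_error_rate("habari ya asubuhi", texts[0]))
```

Scoring a handful of clips from the training set with each candidate model is a cheap way to decide which one is worth submitting.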
Tomorrow we explore: conformer-ctc-asr baselines
📺 Watch the full replay here:
https://youtu.be/rZuEj6JBZOY?si=QDg2S_5qzqD714cX
📌 Subscribe to catch future ML streams:
https://www.youtube.com/@koleshjr
📆 Join live sessions (Tues/Wed/Thurs):
https://www.twitch.tv/koleshjr/schedule