Your Voice, Your Device, Your Language Challenge

Helping Africa
1 000 CHF
Challenge completed ~1 month ago
Automatic Speech Recognition
Natural Language Processing
278 joined
73 active
Start: Jul 22, 25
Close: Sep 22, 25
Reveal: Sep 22, 25
AIDOL
AiFlow
SWAHILI TEXT-TO-SPEECH IMPLEMENTATION NOTEBOOK (ORPHEUS ARCHITECTURE)
9 Sep 2025, 06:44 · 5

Hi all, I'm sharing a Jupyter notebook for Swahili Text-to-Speech (TTS) that I built. The notebook walks through data loading and preprocessing, model training, and inference, and includes the example synthesized audio that I mentioned in a previous chat. Everything needed to run the notebook (install commands/requirements) is in the first cell, with no hidden dependencies. Output audio samples are already embedded in it.

Quick run steps:

  1. Open the notebook and run the top cell to install dependencies.
  2. Point the data loader to your Swahili dataset (instructions included). The notebook uses a subset of Common Voice.
  3. Run the training/inference cells to reproduce the example outputs (sample audio is included).
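
As a rough illustration of step 2, here is a minimal sketch of picking a subset of clips from a Common Voice-style TSV. It is not the notebook's actual loader: the function name `select_subset` and its parameters are hypothetical, and it assumes the `path` and `sentence` columns found in Common Voice's `validated.tsv`.

```python
import csv
import io


def select_subset(tsv_text, max_clips=1000, min_chars=3):
    """Return up to max_clips (audio path, sentence) pairs from Common Voice-style TSV text.

    Hypothetical helper for illustration; assumes 'path' and 'sentence' columns.
    """
    # QUOTE_NONE: treat quote characters inside sentences as plain text.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    pairs = []
    for row in reader:
        path = (row.get("path") or "").strip()
        sentence = (row.get("sentence") or "").strip()
        # Skip rows with no audio path or a missing/very short transcript.
        if path and len(sentence) >= min_chars:
            pairs.append((path, sentence))
        if len(pairs) >= max_clips:
            break
    return pairs
```

To use it on a real file, read the TSV with `open(..., encoding="utf-8")` and pass the text in; the returned paths are relative to the dataset's clips folder.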

Link to the Colab notebook: https://colab.research.google.com/drive/1v3A1BsmiyO7bmFPkt-ZoRE-khazjHmoK?usp=sharing

I welcome feedback, issues, and collaborators — please reply here or DM me if you want to test, improve, or adapt it for other African languages.

Discussion 5 answers
nymfree

Very interesting. Thanks for sharing. You used this to generate more training data, I guess. Any hint as to how you handled augmentations? It feels like the test data is not as clean as the Common Voice datasets.

9 Sep 2025, 07:23
Upvotes 1
AIDOL
AiFlow

Hello nymfree

The notebook is strictly TTS. It could obviously be used to synthesize more data points, and that can in fact be done across distributions, with added noise to bring the audio close enough to what we have in the test set. However, that has not been done in this notebook. It's basically just a TTS generation pipeline.
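
For anyone who wants to try the noise idea, here is a minimal sketch of mixing noise into a synthesized clip at a chosen signal-to-noise ratio. It is not part of the notebook. The function name `add_noise_at_snr` is hypothetical, and white Gaussian noise is only an assumption; the noise in the test set may look quite different (background speech, room echo, microphone artifacts).

```python
import math
import random


def add_noise_at_snr(samples, snr_db, seed=None):
    """Add white Gaussian noise to a list of float samples at the given SNR in dB.

    Hypothetical helper for illustration; Gaussian noise is an assumption.
    """
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(s * s for s in samples) / len(samples)
    # SNR_dB = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


# Example: one second of a 440 Hz tone at 16 kHz, degraded to about 10 dB SNR.
clean = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
noisy = add_noise_at_snr(clean, snr_db=10, seed=0)
```

Plain lists keep the sketch dependency-free; for real audio arrays the same arithmetic is usually vectorized with NumPy, and drawing `snr_db` from a range per clip gives more varied data.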

Thanks, brother, for sharing.

AIDOL
AiFlow

You're most welcome, brother.

Hey scholar

Brilliant work, I will for sure start from it.

How can the TTS synthesis be made to produce a consistent voice, as in StyleTTS?

10 Sep 2025, 06:36
Upvotes 0