Hi all, I'm sharing a Jupyter notebook for Swahili Text-to-Speech (TTS) that I built. The notebook walks through data loading and preprocessing, model training, and inference, and includes the example synthesized audio that I pointed out in a previous chat. Everything needed to run the notebook (install commands/requirements) is in the first cell, with no hidden dependencies. Output audio is already embedded in it.
Quick run steps:
Link to the Colab notebook: https://colab.research.google.com/drive/1v3A1BsmiyO7bmFPkt-ZoRE-khazjHmoK?usp=sharing
I welcome feedback, issues, and collaborators — please reply here or DM me if you want to test, improve, or adapt it for other African languages.
Very interesting. Thanks for sharing. You used this to generate more training data, I guess. Any hint as to how you handled augmentations? It feels like the test data is not as clean as the Common Voice datasets.
Hello nymfree
The notebook is strictly TTS. It could obviously be used to synthesize more data points, and in fact that could be done across distributions, with added noise, to get close enough to what we have in the test set. However, that has not been done in this notebook. It's basically just a TTS generation pipeline.
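For anyone who wants to try the added-noise idea mentioned above, here is a minimal sketch. It is not taken from the notebook: the function name `mix_at_snr`, the 10 dB target, and the synthetic sine "speech" and Gaussian noise are all assumptions made for illustration. In practice you would load synthesized audio and a real noise recording instead.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the mix has the requested SNR in dB.

    Hypothetical helper for illustration; not part of the shared notebook.
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so it matches the speech length.
    reps = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, reps)[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    if speech_power == 0 or noise_power == 0:
        # Nothing meaningful to scale; return the speech unchanged.
        return speech.copy()

    # SNR_dB = 10 * log10(speech_power / noise_power), solved for noise power.
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    scaled_noise = noise * np.sqrt(target_noise_power / noise_power)
    return speech + scaled_noise


# Synthetic stand-ins (assumptions): a 220 Hz tone as "speech", Gaussian noise.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
speech = 0.5 * np.sin(2 * np.pi * 220 * t)
noise = rng.normal(size=sample_rate // 2)

noisy = mix_at_snr(speech, noise, snr_db=10.0)
```

Sampling `snr_db` from a range per utterance, rather than fixing it, would be one way to cover several noise conditions. What range matches the test set is something you would have to estimate by listening.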
Thanks for sharing, brother.
You're most welcome, brother.
Hey scholar
Brilliant work, I will for sure start from it.
How can the TTS synthesis be made to produce a consistent voice, as in StyleTTS?