Swahili Audio Classification

Helping Kenya
2 000 Points
Challenge completed almost 3 years ago
Classification
Automatic Speech Recognition
Natural Language Processing
201 joined
40 active
Start: Aug 07, 22
Close: Nov 13, 22
Reveal: Nov 13, 22
Professor
How to beat my score fast
Notebooks · 8 Aug 2022, 18:20 · 2

Hi guys, we're already seeing amazing scores, a product of the hard work put into the just-concluded hackathon 🙇. There is still plenty of time to learn and work 💪, so I won't be sharing my hackathon notebook. Instead, I'll share my findings and some tips on how you can beat my current score on the leaderboard.

0) The data is about 99.9% clean. I found only one wrongly labeled file, so there may be no need to hunt for mislabeled audio.

1) Apply good augmentations when converting to spectrograms. For me, removing silence worked well.

2) You'll definitely want to ensemble with an ASR model. The basic Hugging Face tutorial was all I needed for the hack. You can find the official tutorial here:

https://colab.research.google.com/github/huggingface/notebooks/blob/master/examples/audio_classification.ipynb

3) Ensembling diverse approaches worked better than ensembling several copies of the same model.

4) FastAI's FastAudio approach has excellent voice configs for generating spectrograms, which you may want to explore.

5) If you have no idea how to start, use the starter notebook. I also made a comprehensive tutorial on noise audio classification a few weeks ago, here:

https://github.com/osinkolu/DataFest-Africa-Noise-Pollution-Classification-Challenge

If you find this repo insightful, don't forget to leave a star 🌟

Expecting to see y'all on the leaderboard... 😉

Discussion 2 answers
Sodiq_Babawale_
University of Ibadan

Thanks boss

8 Aug 2022, 21:17
Upvotes 1
Professor

Welcome boss