🎙️ Hot Topic: How to beat my score fast

Swahili Audio Classification

Professor

Carnegie Mellon University Africa

How to beat my score fast

Notebooks · 8 Aug 2022, 18:20 · 2

Hi guys, we're already seeing amazing scores, a product of the hard work put into the just-concluded hackathon🙇. There is still plenty of time to learn and work💪, so I won't be sharing my hackathon notebook. Instead, I'll share my findings and some tips on how you can beat my current score on the leaderboard.

0) The data is about 99.9% clean. I found only one wrongly labeled audio file, so there is probably no need to go hunting for label errors.

1) Use good augmentations when converting the audio to spectrograms. For me, removing silence worked well.

2) You'll definitely want to ensemble with an ASR model. The basic Hugging Face tutorial was all I needed for the hack. You can find the official tutorial here:

https://colab.research.google.com/github/huggingface/notebooks/blob/master/examples/audio_classification.ipynb

3) Ensembling diverse approaches worked better than ensembling several copies of the same model.

4) FastAI's FastAudio approach has excellent voice configs for generating spectrograms, which you may want to explore.

5) If you have no idea how to start, use the starter notebook. I also made a comprehensive tutorial on noise audio classification a few weeks ago, which you can find here:

https://github.com/osinkolu/DataFest-Africa-Noise-Pollution-Classification-Challenge

If you find the repo insightful, don't forget to leave a star 🌟
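For tip 1, here is a rough idea of what "removing silence" before building a spectrogram can look like. This is only a minimal NumPy sketch, not the pipeline from my notebook: the function names, the 400-sample frame length, the -40 dB threshold and the 16 kHz sample rate are all assumptions chosen for illustration, and a library routine such as `librosa.effects.trim` or `librosa.effects.split` would normally do the trimming for you.

```python
import numpy as np

def remove_silence(signal, frame_len=400, threshold_db=-40.0):
    """Drop non-overlapping frames that are much quieter than the loudest frame.

    frame_len and threshold_db are illustrative assumptions, not tuned values.
    Samples after the last full frame are discarded.
    """
    signal = np.asarray(signal, dtype=np.float64)
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return signal
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    peak = rms.max()
    if peak == 0:
        return signal  # entirely silent clip: nothing to compare against
    # Energy of each frame in dB relative to the loudest frame.
    rms_db = 20.0 * np.log10(np.maximum(rms, 1e-12) / peak)
    return frames[rms_db > threshold_db].reshape(-1)

def log_spectrogram(signal, n_fft=400, hop=160):
    """Log-power spectrogram with a Hann window; needs len(signal) >= n_fft."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Transpose so the result is (frequency bins, time frames).
    return 10.0 * np.log10(power + 1e-10).T

# Synthetic clip (assumed 16 kHz): 1 s silence, 1 s of a 440 Hz tone, 1 s silence.
sr = 16000
tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
clip = np.concatenate([np.zeros(sr), tone, np.zeros(sr)])

trimmed = remove_silence(clip)
spec = log_spectrogram(trimmed)
print(len(clip), len(trimmed), spec.shape)
```

The point is simply that the spectrogram is built from the voiced part only, so the classifier is not spending capacity on blank regions. On real recordings the threshold would need checking by ear on a few files.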

Expecting to see y'all on the leaderboard.....😉
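P.S. on tips 2 and 3: one simple way to ensemble diverse models is to average their predicted class probabilities (soft voting). The sketch below assumes exactly that; it is not necessarily how my own ensemble was built, and `average_ensemble`, the weights and the toy probabilities are hypothetical names and values made up for illustration.

```python
import numpy as np

def average_ensemble(prob_list, weights=None):
    """Blend per-model class probabilities with a (weighted) average.

    prob_list: list of arrays, each of shape (n_samples, n_classes).
    weights:   optional per-model weights; normalised to sum to 1.
    """
    probs = np.stack([np.asarray(p, dtype=np.float64) for p in prob_list])
    if weights is None:
        weights = np.ones(len(prob_list))
    weights = np.asarray(weights, dtype=np.float64)
    weights = weights / weights.sum()
    # Contract the model axis: (n_models,) x (n_models, n_samples, n_classes).
    return np.tensordot(weights, probs, axes=1)

# Toy predictions from two different approaches (made-up numbers):
spectrogram_cnn_probs = [[0.6, 0.4], [0.3, 0.7]]
asr_model_probs = [[0.2, 0.8], [0.1, 0.9]]

blended = average_ensemble([spectrogram_cnn_probs, asr_model_probs])
predicted_class = blended.argmax(axis=1)
print(blended, predicted_class)
```

Averaging helps most when the models make different kinds of mistakes, which is why mixing a spectrogram model with an ASR-based one tends to beat averaging near-identical models.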

Discussion 2 answers

Sodiq_Babawale_

University of Ibadan

Thanks boss

8 Aug 2022, 21:17

Upvotes 1

Professor

Carnegie Mellon University Africa

Welcome boss

replied to Sodiq_Babawale_ · 14 Aug 2022, 15:39

Upvotes 1
