The data comes from two domains: healthcare (~60%) and general (~40%). The general domain includes news, sports, entertainment, politics, and Wikipedia.
There are 196 hours of accented English recordings. Audio clips average ~11 seconds and come from 13 different countries, covering 120 accents from West, South, and East Africa.
There are 57 819 recordings in train, 3 227 in dev, and 5 070 in test.
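As a quick sanity check on these figures, the short sketch below adds up the split sizes and derives the average clip length. It assumes the 196 hours cover all three splits together, which the description does not state explicitly; the variable names are illustrative only.

```python
# Split sizes as quoted in the description.
split_sizes = {"train": 57_819, "dev": 3_227, "test": 5_070}

# ASSUMPTION: the 196 hours span train, dev and test combined.
total_hours = 196

total_clips = sum(split_sizes.values())

# Average clip duration in seconds under the assumption above.
avg_seconds = total_hours * 3600 / total_clips

# Fraction of all recordings that falls into each split.
split_share = {name: count / total_clips for name, count in split_sizes.items()}

print(f"total clips: {total_clips}")
print(f"average clip length: {avg_seconds:.1f} s")
for name, share in split_share.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption the average works out to a little under 11 seconds per clip, which agrees with the "~11 seconds" quoted above.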
Winning models should be submitted in their original format (PyTorch, TensorFlow, etc.) and in ONNX format for portability and to ease testing.
NOTE: The test audio files will only be uploaded on 19 May 2023, one week before the close of the challenge. Predictions on the test files will be scored on the private leaderboard, which will constitute the final leaderboard for this challenge.
NOTE: SampleSubmission.csv contains "audio_ids" for both dev and test, even though the test audio files are not available until 19 May 2023. From the launch of the challenge until 19 May 2023, submit your predictions for the dev audio_ids and "" for the test audio_ids. After 19 May 2023, submit your predictions for the dev audio_ids together with your actual predictions for the test audio_ids that will be made available on that date.
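The two submission phases described in the note can be sketched as follows. This is a minimal illustration, not the official submission script: the column names `audio_ids` and `transcript`, the helper names, and the example IDs are all assumptions, so check them against the header of the real SampleSubmission.csv. Any ID without a dev prediction is treated as a test ID.

```python
import csv
import io


def build_submission(audio_ids, dev_predictions, test_predictions=None,
                     id_col="audio_ids", text_col="transcript"):
    """Return one row per audio ID, in the order given.

    dev_predictions / test_predictions map audio ID -> predicted text.
    Before the test audio is released, leave test_predictions as None so
    every test ID receives an empty string.
    Column names are hypothetical placeholders.
    """
    test_predictions = test_predictions or {}
    rows = []
    for audio_id in audio_ids:
        if audio_id in dev_predictions:
            text = dev_predictions[audio_id]
        else:
            # Test ID: real prediction if available, otherwise "".
            text = test_predictions.get(audio_id, "")
        rows.append({id_col: audio_id, text_col: text})
    return rows


def write_submission(rows, handle):
    """Write the rows as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)


# Made-up IDs standing in for the contents of SampleSubmission.csv.
ids = ["dev-001", "dev-002", "test-001"]
dev_preds = {"dev-001": "patient reports chest pain", "dev-002": "the match ended in a draw"}

# Phase 1 (before the test files are released): test rows stay empty.
phase_one = build_submission(ids, dev_preds)

# Phase 2 (after release): dev and test predictions are both filled in.
phase_two = build_submission(ids, dev_preds, {"test-001": "election results were announced"})

buffer = io.StringIO()
write_submission(phase_one, buffer)
print(buffer.getvalue())
```

Keeping the row order identical to the sample file, and always emitting every ID, means the same function serves both phases; only the `test_predictions` argument changes.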
How to use Colab on Zindi
How to mount a drive on Colab