The data was collected by 300 taskers from Kenya. There are twelve classes, each a different audio utterance in Swahili.
The objective of this competition is to classify each audio clip as one of the 12 Swahili words, using machine learning or deep learning algorithms.
Here are the 12 words and their English translations. You need to predict the Swahili word; the English is included for interest only.
Files available for download:
Audio.zip: a zip file that contains all audio files for both the train and test sets.
Train.csv: contains the target. This is the dataset that you will use to train your model.
Test.csv: resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
SampleSubmission.csv: shows the submission format for this competition, with the ‘Audio_ID’ column mirroring that of Test.csv and the ‘label’ column containing your predictions. The order of the rows does not matter, but the ‘Audio_ID’ values must be correct.
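As a rough illustration of the submission format described above, the following minimal Python sketch pairs each test ‘Audio_ID’ with a predicted label and writes the two columns out as CSV. The column names ‘Audio_ID’ and ‘label’ come from the description; the helper names, the example IDs, and the placeholder labels are hypothetical, and the exact layout should be checked against SampleSubmission.csv itself.

```python
import csv
import io


def build_submission(audio_ids, predictions):
    """Pair each test Audio_ID with its predicted label (hypothetical helper)."""
    if len(audio_ids) != len(predictions):
        raise ValueError("need exactly one prediction per Audio_ID")
    if len(set(audio_ids)) != len(audio_ids):
        raise ValueError("duplicate Audio_ID values")
    return [{"Audio_ID": a, "label": p} for a, p in zip(audio_ids, predictions)]


def write_submission(rows, handle):
    """Write rows with the two columns named in the description."""
    writer = csv.DictWriter(handle, fieldnames=["Audio_ID", "label"])
    writer.writeheader()
    writer.writerows(rows)


# Placeholder IDs and labels, made up for illustration only.
demo_ids = ["id_001", "id_002"]
demo_preds = ["word_a", "word_b"]

rows = build_submission(demo_ids, demo_preds)
buffer = io.StringIO()
write_submission(rows, buffer)
submission_text = buffer.getvalue()
```

In practice the IDs would be read from Test.csv and the file handle would be an open file rather than an in-memory buffer. Row order is not checked here because the description states it does not matter.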