
TechCabal Ewè Audio Translation Challenge

$1 000 USD
Challenge completed ~1 year ago
Classification
Automatic Speech Recognition
267 joined
80 active
Start: Aug 26, 24
Close: Sep 29, 24
Reveal: Oct 10, 24
Missing audio files
Data · 30 Aug 2024, 11:54 · 12

When iterating through the files, I found that some of the file IDs in both the train and test CSV files have no corresponding audio files. This is despite the fact that the combined number of rows in both CSV files matches the total number of audio files.
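A check like the one described above could be sketched as follows. This is only a minimal sketch under assumptions that are not confirmed by the thread: the ID column is assumed to be called `id`, the audio files are assumed to be `.wav` files named after the ID, and the function name `find_missing_audio` is hypothetical.

```python
import csv
from pathlib import Path


def find_missing_audio(csv_path, audio_dirs, id_column="id", ext=".wav"):
    """Return the IDs listed in csv_path that have no matching audio file.

    Assumptions (not confirmed by the competition data): the CSV has a
    column named `id_column`, and each audio file is named `<id><ext>`.
    """
    # Collect the stem (file name without extension) of every audio file,
    # searching every given folder recursively.
    available = set()
    for folder in audio_dirs:
        for path in Path(folder).rglob(f"*{ext}"):
            available.add(path.stem)

    # Read the IDs from the CSV file.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        ids = [row[id_column] for row in csv.DictReader(handle)]

    # IDs present in the CSV but absent from every audio folder.
    return sorted(set(ids) - available)
```

Passing every downloaded folder in `audio_dirs` at once helps rule out the case where the files exist but ended up in a different folder than expected.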

Discussion 12 answers

You can see the bold audio link in the Data tab, just before the download link for the full data set.

30 Aug 2024, 11:59
Upvotes 0
AhmedMohamed365

All audio files exist, check the files again.

30 Aug 2024, 12:00
Upvotes 1

The test files are also included in this.

Origin

@Chineme4 I think that, because of its size, Google Drive splits the download into two folders. Like me, you probably took the first folder to be the train set and the second, which contains fewer audio files, to be the test set. In fact it is Google that split the download, so you need to merge the two folders; once you do, you will see that no files are missing.
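Merging the two downloaded folders could be sketched like this. It is a rough sketch, not an official procedure: the function name `merge_folders` and the folder names in the usage note are hypothetical, and it assumes file names are unique across the parts and that the destination is a separate folder outside the parts.

```python
import shutil
from pathlib import Path


def merge_folders(part_dirs, dest_dir):
    """Copy every file from the split download folders into one folder.

    The result is flat: sub-folder structure inside the parts is not kept.
    Returns the number of files copied.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = 0
    for part in part_dirs:
        # Materialise the listing first so newly copied files are never revisited.
        for src in list(Path(part).rglob("*")):
            if not src.is_file():
                continue
            target = dest / src.name
            if target.exists():
                # Same file name already merged from an earlier part; skip it.
                continue
            shutil.copy2(src, target)
            copied += 1
    return copied
```

For example, `merge_folders(["download_part_1", "download_part_2"], "all_audio")` would gather both parts into `all_audio`, where those folder names stand in for whatever the extracted folders are actually called.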

31 Aug 2024, 01:42
Upvotes 0

Yeah true, thanks a lot

Hello,

Thank you for your message.

Keep an eye on the Google Drive. The dataset will be updated soon.

Let us know if you still have this issue.

2 Sep 2024, 15:04
Upvotes 1

Could the train and test sets be uploaded and shared separately? I'm having a lot of trouble with the dataset.

Mr_V

I get the same issue. When downloading the files from the drive, I never get all the audio files for both the train and test folders. I have tried both Chrome and Edge, but I never get the full download.

4 Sep 2024, 11:33
Upvotes 1
Mr_V

I built a Selenium script that scrolls to the end of the folder in the drive and, once it reaches the end, downloads all the files. This worked for the training audio files. I did the same for the test files, but I am still not getting all of them: I only get 660 files for the test set.

Haha, yeah, I got the same number of test_2 wav files (660), and my train set contains 6288 wav files. I don't know whether I successfully got the whole dataset or not. The number of rows in Train.csv and Test_1.csv is still larger than the number of files I have downloaded.

Mr_V

I think the fix for me was simply logging in with my Gmail account. I couldn't do that at the time because I was on the work network. Once I got home and used my personal Google Drive account, it pulled all the files.