How's it going with the challenge? Who has started collecting data already and in what languages? What tools are you using? Any questions or tips to share?
This is a late response, but I joined the competition a little late. I have been collecting data for the Chichewa language; it's spoken in Malawi and parts of Zambia, Mozambique and Zimbabwe. I have some questions about ASR models. Is there anyone who can help me with that?
Glad to hear you joined in and really cool that you're collecting for Chichewa! We have office hour appointments available tomorrow; you can sign up here. The caveat is that the people helping out will be linguists, so the modelling expertise may be limited. If you could try your best to summarize your questions about models, I can try to find someone else who might be able to answer -- I would also recommend posting on the general discussion board of the competition to see if anyone else could help answer.
Side note, if you're stretched for time, I would focus more on the data collection deliverables and keep in mind that the ASR model is optional. The Evaluation section of the competition goes into more detail about submission guidelines, and I'll paste some of the info about the model bonus below as FYI:
"(Optional) ASR model bonus: The inclusion of an ASR model (that you train on the data you collected) is offered as a small “bonus point” opportunity. You are encouraged, but not required to submit an ASR model as part of your submission. To help you prioritize, please focus on the rest of the submission deliverables (dataset and related files). This bonus opportunity serves more as a tie breaking differentiator - if two dataset submissions are tied, then the submission that came with an ASR model would rank over the other submission. If you choose to include an ASR model, please know that judges will not look at the performance of your ASR model, they’ll only check Yes/No if a working model was included. (This is because there’s no way to compare model performance when your training data differs. Judges just want to see that you demonstrated the ability to train a model on your data.) Here’s a video on how to train an ASR model in a beginner-friendly way."
Anything you have questions about, don't hesitate to reach out :)
Thanks @Connie. I have signed up for the office hours tomorrow. I will still summarize my questions here:
1. Letter to sound rules. I still can't find a good example file for English which I can use. Therefore, this is still an outstanding question for me.
2. Audio file duration. I know that short audios (30 seconds or less) are recommended, but are long form audios (e.g., 30 minutes) completely useless, or can we still use them during training?
3. End to end models for ASR. I know that for now the focus is on using ELPIS or Kaldi, but I still wanted to try end to end deep learning models as they are easy (pipeline wise). Are there any good tutorials on using TensorFlow to build an end to end ASR model?
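On question 2, here is a minimal sketch of one way long recordings might still be made usable: cutting a long WAV file into consecutive chunks of at most 30 seconds. This is only an illustration and not something the organizers have confirmed. The function name `split_wav`, the output naming scheme, and the fixed-length cutting are all assumptions. Fixed-length cuts can land in the middle of a word, and each chunk would still need its own matching transcription, so splitting at pauses is probably the better choice in practice.

```python
import wave


def split_wav(src_path, out_prefix, max_seconds=30.0):
    """Cut a WAV file into consecutive chunks of at most max_seconds each.

    Hypothetical helper: returns the list of chunk file paths it wrote.
    """
    out_paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        # Number of audio frames that fit into one chunk.
        frames_per_chunk = int(src.getframerate() * max_seconds)
        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the recording
            out_path = f"{out_prefix}_{index:04d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Reuse channels, sample width and sample rate from the source;
                # the frame count in the header is corrected when the file closes.
                dst.setparams(params)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

For example, `split_wav("interview.wav", "interview_chunk")` (hypothetical file names) would write `interview_chunk_0000.wav`, `interview_chunk_0001.wav`, and so on, with the last chunk possibly shorter than 30 seconds.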
I managed to chat with Alena and Sandy. Thanks for helping with the office hours. Regarding submission, I'm making mine through the Zindi email address as per advice from @amyflorida626, because my files are around 2GB. However, there is no direct confirmation that the submission file has been received. So, I just wanted to point this out in case things go south during the final submission hours.
Hi Dunstan, glad to hear you were able to chat with Alena and Sandy. Submission through email is okay. Amy is going to check in on all emails / submissions on Monday - she should reply then with confirmation as well. I'll check in with her as well and make sure she has your complete submission. Enjoy your weekend! :)