Previous discussions such as https://zindi.africa/competitions/giz-nlp-agricultural-keyword-spotter/discussions/3399 have asked about the use of pre-trained models or external datasets. We clarified that participants should stick to the competition data or 'standard' options like the pre-trained ImageNet models built into many libraries. However, we did mention that anyone with suggestions for other open-source resources they thought would be useful should get in touch.
Based on your suggestions, we're adding the following resources as allowable in the competition:
- https://github.com/tensorflow/models/tree/master/research/audioset/vggish
- https://github.com/qiuqiangkong/audioset_classification
- https://github.com/qiuqiangkong/audioset_tagging_cnn
Note that, with only a month left, we will not be making further changes, in the interest of fairness.
Good. So can we also use the AudioSet dataset, since you are giving us models pre-trained on AudioSet? AudioSet is open source too.
Or should we use only the pre-trained weights?
You should only use pre-trained weights.
Moreover, the dataset is tricky to download: you have to fetch the clips one by one (you only get a link to the YouTube video) and then extract the audio. Some links have since gone dead.
Great, thank you! What about DeepSpeech by Mozilla? It is open source too.
I want to recommend adding this: https://alphacephei.com/vosk/index
It is an open-source library.