
GIZ NLP Agricultural Keyword Spotter

Helping Uganda
$7 000 USD
Challenge completed ~5 years ago
Classification
Automatic Speech Recognition
Natural Language Processing
737 joined
253 active
Start
Sep 11, 20
Close
Nov 29, 20
Reveal
Nov 29, 20
Rules clarification - Additional datasets
Data · 10 Oct 2020, 12:15 · edited 12 minutes later · 5

The rules say "You may use only the datasets provided for this competition", but participants can use pretrained weights, e.g. "ImageNet". If I first train my model on a public dataset that is not allowed and use the result as a checkpoint, will that be a rule violation?

I am sorry for my ignorance. This is my first competition on this platform.

Discussion 5 answers

I was also asking myself the same question, but I guess the answer is: yes, it will be a rule violation.

11 Oct 2020, 05:25
Upvotes 0
Insat

The rules aren't that clear! I think that training on other public audio datasets just to get a pretrained model for this task is okay, because pretrained weights like ImageNet are used. But merging a public dataset with this dataset to train your final model would result in disqualification. Am I correct?

11 Oct 2020, 11:37
Upvotes 0

But maybe you should share the pretrained weights with everyone, i.e. open source them.

Insat

Zindi needs to confirm. I am not sure.

ZINDI

With image classification, models pre-trained on ImageNet are somewhat of a standard, and are often built into popular libraries. For audio there isn't an exact equivalent.

Obviously, we don't want a situation where someone wins because of access to something the other participants didn't have. So in general, sourcing an extra dataset (even a public one) and using that to get an edge would be a potential issue. But if you have a dataset (or even better, a pretrained model) in mind that you think would help all entrants, and it's public and free, let us know and we can see about adding it as an allowed source.