The rules say "You may use only the datasets provided for this competition", but participants can use pretrained weights, e.g. "Imagenet". If I first train my model on a public dataset that is not allowed and use the result as a checkpoint, will it be a rule violation?
I am sorry for my ignorance. This is my first competition on this platform.
I was also asking myself the same question, but I guess the answer is: yes, it will be a rule violation.
The rules aren't that clear! I think that training on other public audio datasets just to get a pretrained model for this task is okay, because pretrained weights like imagenet are used. But merging a public dataset with this dataset to train your final model would result in disqualification. Am I correct?
But maybe you should share the pretrained weights with everyone, i.e. open source them.
Zindi needs to confirm. I am not sure.
With image classification, models pre-trained on imagenet are somewhat of a standard, and often built into popular libraries. For audio there isn't an exact equivalent.
Obviously, we don't want a situation where someone wins because of access to something the other participants didn't have. So in general, sourcing an extra dataset (even a public one) and using that to get an edge would be a potential issue. But if you have a dataset (or even better, a pretrained model) in mind that you think would help all entrants, and it's public and free, let us know and we can see about adding it as an allowed source.