This solution needs to be deployed on edge devices. This means we have some interesting resource restrictions to ensure this model is usable.
- You may use only one CPU, such as an ARM Cortex-A53 or similar.
- No GPU or TPU support is allowed.
- Your model size must be 10MB or less.
- You can only train for a maximum of 6 hours.
- Inference time needs to be 2 minutes or less. This is to simulate a ~50ms inference time which is needed for real-time edge applications.
- You are not allowed to use pretrained models.
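For anyone who wants to sanity-check a submission against the size and time limits above, here is a minimal sketch. It assumes 1MB means 1024 * 1024 bytes and that the model is saved as a single file; `model_size_ok`, `time_inference`, and the stand-in predictor are hypothetical names for illustration, not part of any official checker. It times whatever function you pass in, so what exactly counts toward the 2 minutes is left to you.

```python
import os
import tempfile
import time

MAX_MODEL_BYTES = 10 * 1024 * 1024  # 10MB cap; assumes 1MB = 1024 * 1024 bytes
MAX_INFERENCE_SECONDS = 2 * 60      # 2-minute inference cap from the rules


def model_size_ok(model_path, limit_bytes=MAX_MODEL_BYTES):
    """Return True if the saved model file fits within the size cap."""
    return os.path.getsize(model_path) <= limit_bytes


def time_inference(predict_fn, samples):
    """Run predict_fn on each sample.

    Returns (predictions, total_seconds, seconds_per_sample).
    """
    start = time.perf_counter()
    predictions = [predict_fn(s) for s in samples]
    total = time.perf_counter() - start
    per_sample = total / len(samples) if samples else 0.0
    return predictions, total, per_sample


# Stand-in model file and predictor, for illustration only.
with tempfile.NamedTemporaryFile(delete=False, suffix=".bin") as f:
    f.write(b"\0" * 1024)  # pretend this is a 1KB serialized model
    demo_path = f.name

print("size ok:", model_size_ok(demo_path))
_, demo_total, _ = time_inference(lambda x: x > 0, [1, -2, 3])
print("within time limit:", demo_total <= MAX_INFERENCE_SECONDS)
os.remove(demo_path)
```

Timing on a laptop or cloud CPU will not match a Cortex-A53, so treat the numbers as a rough guide only.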
We can't wait to see you on the leaderboard!
I don't understand the language, so it's too difficult for me to label the audio data. How did you label the data to determine whether the first audio clip corresponds to a "hello" or not?
Hi @balla, I don't think you have to do it yourself. The data is already labeled for you. When you load the train data, the label is in the class column. The Test df has no class column, which makes sense: that is what you need to predict.
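To illustrate, here is a rough sketch of the difference between the two files. Only the `class` column name comes from the actual data; the `id` column, the clip names, the label values, and the helper functions are made up for the example, and the CSV text is an in-memory stand-in rather than the real files.

```python
import csv
import io

# Tiny in-memory stand-ins for the train and test files.
# "class" is the label column; "id" and all values are invented.
TRAIN_CSV = "id,class\nclip_001,hello\nclip_002,other\n"
TEST_CSV = "id\nclip_101\nclip_102\n"


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def split_labels(rows, label_column="class"):
    """Return (ids, labels); labels is None when the label column is absent."""
    ids = [row["id"] for row in rows]
    if rows and label_column in rows[0]:
        return ids, [row[label_column] for row in rows]
    return ids, None


train_ids, train_labels = split_labels(read_rows(TRAIN_CSV))
test_ids, test_labels = split_labels(read_rows(TEST_CSV))
print(train_labels)  # labels are provided for training
print(test_labels)   # None: these are what the model must predict
```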
Please write more clearly. The CPU limit is only for inference, right? We can train on a GPU, right? What do you mean by not allowing pretrained models? Can I fine-tune, say, a public Hugging Face model or Whisper from OpenAI?
For this statement - "Inference time needs to be 2 minutes or less. This is to simulate a ~50ms inference time which is needed for real-time edge applications."
Does it mean that inference includes feature engineering as well? If so, does the 2-minute limit apply to a single audio file or to the entire test set?
Or does it mean that only the model predictions should take less than 2 minutes?
@Amy_Bray, please help clarify.
Are we allowed to use fine-tuned models, @Amy_Bray?
Hi everyone,
Thanks for your questions! Let me clarify the resource restrictions for the challenge:
I hope this clears things up! If you have any more questions, feel free to ask. Good luck to everyone, and we look forward to seeing your solutions on the leaderboard!
Thanks @Amy_Bray for the clarification.
Can I use an open-source model from Hugging Face for feature extraction and use those features to train my own model from scratch?