Hello fellow contestants! On behalf of team LMrab3in, here is our second-place solution for this competition:
Second Place Source Code Solution
Our overall approach:
- Extract mel features from audio.
- Train ImageNet-pretrained models with 10 stratified folds, blending the fold predictions with the geometric mean (gmean).
- Blend all the models together, also using gmean.
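The gmean blending used for both the fold predictions and the final model blend can be sketched like this (numpy-only; array shapes are illustrative assumptions, not our actual dimensions):

```python
import numpy as np

def gmean_blend(preds, eps=1e-12):
    """Geometric mean of stacked probability matrices.

    preds: array-like of shape (n_folds_or_models, n_samples, n_classes).
    """
    preds = np.asarray(preds, dtype=np.float64)
    # Average in log space for numerical stability; eps guards log(0).
    return np.exp(np.mean(np.log(preds + eps), axis=0))

# Blend predictions from 10 stratified folds; models are blended the same way.
fold_preds = np.random.rand(10, 4, 3)  # hypothetical (folds, samples, classes)
fold_preds /= fold_preds.sum(axis=2, keepdims=True)
blended = gmean_blend(fold_preds)
```

Compared with a plain average, the geometric mean punishes predictions where any single fold is very confident in the wrong direction, which often helps blends of correlated models.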
What worked/didn't work for us:
- Reducing the hop length hugely improved the results for all of us.
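Why a smaller hop length helps: it increases the time resolution of the spectrogram, so each clip becomes a wider "image" with more frames. The frame-count arithmetic below assumes centered STFT framing (librosa's default); the sample rate and hop values are illustrative:

```python
def n_frames(n_samples, hop_length):
    # Frame count for a centered STFT (librosa-style padding).
    return 1 + n_samples // hop_length

sr = 32000                                 # hypothetical sample rate
clip = 5 * sr                              # a 5-second clip
wide = n_frames(clip, hop_length=512)      # coarse time axis
narrow = n_frames(clip, hop_length=128)    # ~4x more frames per clip
```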
- Per-channel energy normalization (PCEN) looked poor on the public LB, but it scored well on the private LB and boosted the blend results overall.
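For reference, a minimal numpy version of the PCEN transform, following the standard formulation (smoothed energy estimate, adaptive gain control, then root compression). The parameter values are common defaults, not our exact settings; in practice librosa's `pcen` is the usual choice:

```python
import numpy as np

def pcen(E, s=0.025, alpha=0.98, delta=2.0, r=0.5, eps=1e-6):
    """Per-channel energy normalization of a mel energy matrix E (mels x frames)."""
    M = np.empty_like(E)
    M[:, 0] = E[:, 0]
    for t in range(1, E.shape[1]):
        # First-order IIR smoother over time, per mel channel.
        M[:, t] = (1 - s) * M[:, t - 1] + s * E[:, t]
    # Adaptive gain control followed by root compression.
    return (E / (eps + M) ** alpha + delta) ** r - delta ** r

mel = np.abs(np.random.randn(128, 64)) ** 2  # hypothetical mel energies
out = pcen(mel)
```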
- Working with a low batch size (4 -> 10) seemed to improve the results.
- Since we used ImageNet-pretrained models, we had three approaches for turning a mel spectrogram into a 3-channel image:
1. Stack the spectrogram with its derivatives (delta order 1 and delta order 2).
2. Add a conv layer before the pretrained model (input channels = 1, output channels = 3).
3. Change the pretrained model's first conv layer to accept a single channel.
All three approaches worked for us.
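Approach 1 can be sketched with plain numpy. Here `np.gradient` stands in for proper delta features (`librosa.feature.delta` would be the usual choice); the spectrogram shape is illustrative:

```python
import numpy as np

def stack_deltas(spec):
    """Turn a (mels, frames) spectrogram into a 3-channel 'image'."""
    d1 = np.gradient(spec, axis=1)   # first temporal derivative (delta)
    d2 = np.gradient(d1, axis=1)     # second derivative (delta-delta)
    return np.stack([spec, d1, d2])  # shape: (3, mels, frames)

img = stack_deltas(np.random.rand(128, 256))
```

Approaches 2 and 3 are one-line changes in PyTorch: prepend `nn.Conv2d(1, 3, kernel_size=1)` before the backbone, or replace the backbone's first conv layer with one that takes a single input channel.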
- Deeper models (ResNeSt-269, ResNeXt-101) helped achieve better results.
- Z-score normalization (standardization) and min-max scaling multiplied by 255 helped the models converge faster.
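The two normalizations, sketched on a spectrogram array (the epsilon guard against constant inputs is an assumption, not necessarily our exact code):

```python
import numpy as np

def z_score(x, eps=1e-8):
    # Zero mean, unit variance.
    return (x - x.mean()) / (x.std() + eps)

def min_max_255(x, eps=1e-8):
    # Rescale into the 0..255 image range expected by many image pipelines.
    return 255.0 * (x - x.min()) / (x.max() - x.min() + eps)

spec = np.random.rand(128, 256)
z = z_score(spec)
u8 = min_max_255(spec)
```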
- Training with a ReduceLROnPlateau scheduler with low patience and a min_lr value helped improve the results of some models.
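The plateau behaviour we relied on, reimplemented as a minimal sketch (we actually used PyTorch's `ReduceLROnPlateau`; the patience, factor, and min_lr values here are illustrative):

```python
class PlateauLR:
    """Reduce lr by `factor` after `patience` epochs without improvement."""

    def __init__(self, lr, factor=0.5, patience=1, min_lr=1e-6):
        self.lr, self.factor = lr, factor
        self.patience, self.min_lr = patience, min_lr
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs > self.patience:
                # Drop the lr, but never below min_lr.
                self.lr = max(self.lr * self.factor, self.min_lr)
                self.bad_epochs = 0
        return self.lr

sched = PlateauLR(lr=1e-3, patience=1)
for loss in [0.9, 0.8, 0.85, 0.84]:  # two epochs without improvement
    lr = sched.step(loss)
```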
- An after-training step, retraining the trained models at low learning rates with a CosineAnnealingLR scheduler, improved the results of some models.
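The cosine schedule used in that low-LR retraining pass follows the standard half-cosine decay (mirroring `CosineAnnealingLR`; the eta_max, eta_min, and T_max values are illustrative):

```python
import math

def cosine_lr(epoch, t_max, eta_max=1e-4, eta_min=1e-6):
    # Decay from eta_max to eta_min over t_max epochs along a half cosine.
    return eta_min + 0.5 * (eta_max - eta_min) * (1 + math.cos(math.pi * epoch / t_max))

lrs = [cosine_lr(e, t_max=10) for e in range(11)]  # eta_max down to eta_min
```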
- TTA improved the results of only a few models.
- Stacking the models made the results worse so we dropped it.
- Every team member used a different kind of data augmentation (or none). SpecMix seemed to improve the results for some models.
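A rough sketch of the SpecMix idea as we understand it: SpecAugment-style frequency and time bands are filled from a second sample instead of being zeroed, and the labels are mixed in proportion to the replaced area. The single-band choice and band-width limit are simplifying assumptions:

```python
import numpy as np

def spec_mix(spec_a, spec_b, label_a, label_b, rng, max_band=30):
    """Replace one frequency band and one time band of spec_a with spec_b."""
    out = spec_a.copy()
    mels, frames = out.shape
    f_w = rng.integers(1, max_band)
    t_w = rng.integers(1, max_band)
    f0 = rng.integers(0, mels - f_w)
    t0 = rng.integers(0, frames - t_w)
    out[f0:f0 + f_w, :] = spec_b[f0:f0 + f_w, :]
    out[:, t0:t0 + t_w] = spec_b[:, t0:t0 + t_w]
    # Label weight = fraction of the area now coming from sample b
    # (band overlap counted once).
    lam = (f_w * frames + t_w * mels - f_w * t_w) / (mels * frames)
    label = (1 - lam) * label_a + lam * label_b
    return out, label

rng = np.random.default_rng(0)
a, b = np.zeros((128, 256)), np.ones((128, 256))
mixed, y = spec_mix(a, b, np.array([1.0, 0.0]), np.array([0.0, 1.0]), rng)
```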
Things we were planning on doing but didn't find time for:
- Mixup showed some potential in the blend, but since its results were a bit off and we already had too many models, we dropped it. Experimenting more with mixup could improve the results.
- Pseudo-labeling deserved some experiments because of the small dataset and strong single models.
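For completeness, the mixup we experimented with is just a convex combination of input/label pairs, with the mixing weight drawn from a Beta distribution (alpha=0.4 is a common choice, not necessarily ours):

```python
import numpy as np

def mixup(x1, y1, x2, y2, rng, alpha=0.4):
    lam = rng.beta(alpha, alpha)  # mixing weight in [0, 1]
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

rng = np.random.default_rng(0)
x, y = mixup(np.zeros((128, 256)), np.array([1.0, 0.0]),
             np.ones((128, 256)), np.array([0.0, 1.0]), rng)
```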