
GIZ NLP Agricultural Keyword Spotter

Helping Uganda · $7,000 USD · Completed
Classification · Automatic Speech Recognition · Natural Language Processing
739 joined · 253 active
Start: 11 Sep 2020 · Close: 29 Nov 2020 · Reveal: 29 Nov 2020
Our approach (2nd place solution)
Connect · 7 Dec 2020, 14:52 · edited ~2 hours later

Hello fellow contestants. On behalf of team LMrab3in, here is the second-place solution for this competition:

Second Place Source Code Solution

Our overall approach:

  1. Extract mel features from the audio.
  2. Train ImageNet-pretrained models with 10 stratified folds, blending the fold results with the geometric mean (gmean).
  3. Blend all the models using gmean.
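The blending in steps 2 and 3 can be sketched as follows. This is a generic geometric-mean blend in NumPy, not the team's actual code; the input shape, the epsilon guard, and the final renormalisation are assumptions.

```python
import numpy as np

def gmean_blend(prob_stack):
    """Geometric-mean blend of class probabilities across folds/models.

    prob_stack: array of shape (n_models, n_samples, n_classes).
    Returns (n_samples, n_classes), renormalised to sum to 1 per row.
    """
    eps = 1e-12  # guard against log(0) for hard-zero probabilities
    logp = np.log(np.clip(prob_stack, eps, 1.0))
    blended = np.exp(logp.mean(axis=0))  # geometric mean over models
    return blended / blended.sum(axis=1, keepdims=True)
```

Compared with an arithmetic mean, the geometric mean penalises predictions that any single model is confidently wrong about, which often suits blends of diverse models.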

What worked/didn't work for us:

  • Reducing the hop length hugely improved the results for all of us.
  • Per-channel energy normalization (PCEN) looked awful on the public LB, but it gave good results on the private LB and boosted the blend results overall.
  • Training with a low batch size (4 to 10) seemed to improve the results.
  • Since we used ImageNet-pretrained models, we had three approaches for turning a mel spectrogram into a 3-channel image:
    1. Stack the spectrogram with its derivatives (delta order 1 and delta order 2).
    2. Add a conv layer before the pretrained model (input channels = 1, output channels = 3).
    3. Change the pretrained model's first conv layer to accept a single channel.
    All three approaches worked for us.
  • Deeper models (ResNeSt-269, ResNeXt-101) helped achieve better results.
  • Z-score normalization (standardization) and min-max scaling to [0, 255] helped the models converge faster.
  • Training with the ReduceLROnPlateau scheduler with low patience and a min_lr value improved the results of some models.
  • An after-train stage, retraining the trained models at low learning rates with the CosineAnnealingLR scheduler, improved the results of some models.
  • Test-time augmentation (TTA) improved the results of only a few models.
  • Stacking the models made the results worse, so we dropped it.
  • Every member of the team used a different kind of data augmentation (or none). SpecMix seemed to improve the results for some models.
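The first of the channel-conversion approaches above (stacking the spectrogram with its deltas) can be sketched in plain NumPy. Here `np.diff` is a simple stand-in for `librosa.feature.delta`, not the team's implementation, and the padding choice is an assumption.

```python
import numpy as np

def stack_deltas(mel):
    """Turn a single-channel mel spectrogram (n_mels, n_frames) into a
    3-channel 'image' by stacking it with its first- and second-order
    time differences, so ImageNet-pretrained models accept it as RGB."""
    d1 = np.diff(mel, n=1, axis=1, prepend=mel[:, :1])  # delta order 1
    d2 = np.diff(d1, n=1, axis=1, prepend=d1[:, :1])    # delta order 2
    return np.stack([mel, d1, d2], axis=0)  # (3, n_mels, n_frames)
```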
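The two input normalisations mentioned above could look like this minimal sketch; the per-spectrogram statistics and the epsilon guards are assumptions, not details from the write-up.

```python
import numpy as np

def zscore(x):
    # Z-score standardization: zero mean, unit variance per spectrogram.
    return (x - x.mean()) / (x.std() + 1e-8)

def minmax_255(x):
    # Min-max scaling to [0, 255], matching the ImageNet image range.
    x_min, x_max = x.min(), x.max()
    return (x - x_min) / (x_max - x_min + 1e-8) * 255.0
```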

Things we were planning on doing but didn't find time for:

  • Mixup had some potential to help in the blend, but since its results were a bit off and we already had too many models, we dropped it. Experimenting more with mixup could potentially improve the results.
  • Pseudo-labeling deserved some experiments, given the small dataset and the strong single models.
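For reference, the standard mixup the team considered (Zhang et al.) can be sketched as below. The `alpha=0.4` default is a commonly used value, not one taken from the write-up.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    """Mixup: convex combination of two training examples and their
    one-hot labels, with the mixing weight drawn from Beta(alpha, alpha)."""
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2
```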
Discussion (4 answers)

Thanks a lot for sharing

7 Dec 2020, 15:16
University of lagos

You are blessed

7 Dec 2020, 15:37
_MUFASA_

Awesome stuff... congrats!

7 Dec 2020, 16:12

Congratulations and thank you for sharing!

7 Dec 2020, 18:16