How to extract audio features using VGGish
Help · 6 Nov 2020, 20:07 · edited 7 minutes later · 2

I have tried following this link to extract audio features:

https://colab.research.google.com/drive/1TbX92UL9sYWbdwdGE0rJ9owmezB-Rl1C

but I am unable to input raw audio files and get the 128 extracted features.

How can I extract the features?

Discussion 2 answers

You can try using the VGGish model from TensorFlow Hub:

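import tensorflow as tf
import tensorflow_hub as hub
import numpy as np
import librosa

# Load the pretrained VGGish model from TensorFlow Hub.
vggmodel = hub.load('https://tfhub.dev/google/vggish/1')

def embedding_from_fn(fn):
    # Load the audio file as mono at its native sampling rate.
    x, sr = librosa.load(fn, sr=None)
    # VGGish expects a 16 kHz waveform; newer librosa versions require keyword arguments here.
    x_16k = librosa.resample(x, orig_sr=sr, target_sr=16000)
    # The model returns one 128-dimensional embedding per 0.96 s frame of audio.
    embedding = np.array(vggmodel(x_16k))
    return embedding

# df is assumed to be a pandas DataFrame whose 'fn' column holds audio file paths.
embedding = embedding_from_fn(df['fn'].values[0])
print(embedding.shape)  # (num_frames, 128)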
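
Note that the model returns one 128-dimensional row per 0.96 s frame, so a clip longer than that gives you several rows rather than a single vector. If you need exactly 128 features per file, one common option is to average over the frames. That is my suggestion, not something the model requires. A minimal sketch, where `pool_embeddings` is a hypothetical helper and the made-up array stands in for the real model output:

```python
import numpy as np

def pool_embeddings(frame_embeddings):
    """Average per-frame embeddings of shape (num_frames, dim) into one vector."""
    frames = np.asarray(frame_embeddings, dtype=np.float32)
    if frames.ndim != 2 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty (num_frames, dim) array")
    # Mean over the time axis keeps the feature dimension intact.
    return frames.mean(axis=0)

# Stand-in for the model output: 3 frames of 128 values each (made-up data).
fake_frames = np.arange(3 * 128, dtype=np.float32).reshape(3, 128)
clip_vector = pool_embeddings(fake_frames)
print(clip_vector.shape)  # (128,)
```

With the real model you would call `pool_embeddings(embedding_from_fn(path))`. Averaging discards how the sound changes over time, so keep the per-frame rows if your downstream model can use a sequence.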

6 Nov 2020, 20:17
Upvotes 0