How to extract audio features using VGGish
published 6 Nov 2020, 20:07
edited 7 minutes later

I have tried following this link to extract audio features:

but I am unable to feed in raw audio files and get the 128 extracted features.

How can I extract the features?

You can try using TensorFlow Hub:

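import tensorflow as tf
import tensorflow_hub as hub
import numpy as np
import librosa

# Load the model.
vggmodel = hub.load('')

def embedding_from_fn(fn):
    # Load the file at its native sampling rate (librosa returns mono by default).
    x, sr = librosa.load(fn, sr=None)
    # Resample to 16 kHz; recent librosa versions require keyword arguments here.
    x_16k = librosa.resample(x, orig_sr=sr, target_sr=16000)
    embedding = np.array(vggmodel(x_16k))
    return embedding

# df is assumed to be a DataFrame whose 'fn' column holds audio file paths.
embedding = embedding_from_fn(df['fn'].values[0])
print(embedding.shape)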
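VGGish-style models usually return one 128-value embedding per short frame of audio (roughly a second each), so the array coming out of embedding_from_fn typically has shape (num_frames, 128), not a single row of 128 features. If you want one 128-value vector per file, one common option is to average over the frames. Below is a minimal sketch of that idea; pool_embedding is a hypothetical helper name, and the array of ones is only a stand-in for a real model output.

```python
import numpy as np

def pool_embedding(frames):
    """Average per-frame embeddings of shape (num_frames, dim) into one (dim,) vector."""
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 2 or frames.shape[0] == 0:
        raise ValueError("expected a non-empty (num_frames, dim) array")
    return frames.mean(axis=0)

# Stand-in for the output of embedding_from_fn: 3 frames of 128 values each.
fake_frames = np.ones((3, 128), dtype=np.float32)
clip_vector = pool_embedding(fake_frames)
print(clip_vector.shape)  # (128,)
```

Mean pooling throws away the time ordering, so if your downstream model can handle sequences you may prefer to keep the per-frame array as it is.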