I threw together a quick starter notebook: https://colab.research.google.com/drive/1SdrwQtT16FAGmfssJi5riMjDOQU7IvWE
It uses fastai, treating this as essentially an image classification task. As you can imagine, this isn't the optimal solution! That said, it's possible to get some pretty good performance with this method.
It scores ~2.7. Accuracy is only ~32%, but it gets the right bird in the top 5 guesses >60% of the time.
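In case the gap between plain accuracy and top-5 accuracy is unclear, here is a minimal sketch of how the two can be computed from a matrix of predicted class probabilities. The function name `top_k_accuracy` and the toy numbers are made up for illustration; this is not the notebook's code.

```python
import numpy as np

def top_k_accuracy(probs, labels, k=5):
    """Fraction of rows whose true label is among the k highest-scoring classes.

    probs:  (n_samples, n_classes) array of predicted scores or probabilities
    labels: (n_samples,) array of integer class ids
    """
    probs = np.asarray(probs)
    labels = np.asarray(labels)
    # Indices of the k largest scores in each row (order within the k is irrelevant).
    top_k = np.argsort(-probs, axis=1)[:, :k]
    # A row is a hit if its true label appears anywhere in its top-k set.
    hits = (top_k == labels[:, None]).any(axis=1)
    return float(hits.mean())

# Toy example: 3 clips, 3 classes (hypothetical values).
probs = [[0.1, 0.6, 0.3],
         [0.5, 0.2, 0.3],
         [0.2, 0.3, 0.5]]
labels = [2, 0, 1]

top1 = top_k_accuracy(probs, labels, k=1)  # plain accuracy
top2 = top_k_accuracy(probs, labels, k=2)
```

With k=1 this is ordinary accuracy; larger k gives credit whenever the right bird is anywhere in the model's shortlist.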
I'm hoping to see some different techniques springing up (please share what you're trying!!), but this at least shows an example path: loading the data, extracting features (spectrograms in this case), training a model, and making predictions.
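For anyone who wants to see the feature-extraction step in isolation, here is a rough sketch of turning a 1-D audio signal into a log-power spectrogram with plain NumPy. It is an assumption-laden illustration, not what the notebook does: the function name `log_spectrogram`, the frame and hop sizes, and the synthetic test tone are all made up, and there is no mel scaling here.

```python
import numpy as np

def log_spectrogram(signal, frame_len=256, hop=128):
    """Log-power spectrogram of a 1-D signal, shape (frame_len // 2 + 1, n_frames)."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    # Hann window reduces spectral leakage at the frame edges.
    frames = signal[idx] * np.hanning(frame_len)
    # Power spectrum of each frame (real-input FFT keeps non-negative frequencies).
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Convert to decibels; the small constant avoids log(0).
    return (10.0 * np.log10(power + 1e-10)).T

# Synthetic stand-in for a recording: 1 second of a 440 Hz tone at 8 kHz.
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

spec = log_spectrogram(tone)
peak_bin = int(spec.mean(axis=1).argmax())
peak_hz = peak_bin * sr / 256  # centre frequency of the strongest bin
```

The resulting 2-D array (frequency by time) is what gets saved or plotted as an image and handed to an image classifier.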
Good luck all :)
Great notebook.
The "Get Creative" part at the end got me pretty excited. I'm going to try some crazy ideas.
Sweet, thanks for the starter code! It's pretty creative to use the spectrogram and turn this into an image classification problem but, as you say, it's probably not optimal.
For those who want to explore audio and signal processing further, this is a handy guide that goes from noob to pro pretty quickly. The noob parts are general knowledge, suitable for anyone interested in how audio is processed digitally. The pro parts involve more intense Python, so if you don't want to use Python, you can safely skip them.
https://www.analyticsvidhya.com/blog/2017/08/audio-voice-processing-deep-learning/