The data consists of mp3 files with unique IDs as file names, split into train and test sets and available as zip files in the downloads section. The labels for the training set are contained in train.csv; each label corresponds to one of the 40 bird species listed below. Your task is to predict the labels for the test set, following the format in sample_submission.csv.
In cases where more than one species is calling (many recordings contain faint background noise), the labels correspond to the most prominent call, and your predictions should do likewise.
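As a rough illustration of the workflow described above, the sketch below builds a trivial baseline submission that assigns the most frequent training label to every test file. The column layout of train.csv and sample_submission.csv is not stated here, so the code assumes that each file has a header row and that the label sits in the last column unless a column name is passed explicitly. The function names and the `label_column` parameter are hypothetical, chosen only for this example.

```python
import csv
from collections import Counter


def most_common_label(train_csv_path, label_column=None):
    """Return the most frequent label found in the training CSV."""
    with open(train_csv_path, newline="") as f:
        reader = csv.DictReader(f)
        # Assumption: when no column name is given, the label is the last column.
        column = label_column or reader.fieldnames[-1]
        counts = Counter(row[column] for row in reader)
    return counts.most_common(1)[0][0]


def write_constant_submission(sample_path, out_path, label, label_column=None):
    """Copy the sample submission, replacing every label with `label`."""
    with open(sample_path, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        # Same assumption as above: the label is the last column by default.
        column = label_column or fieldnames[-1]
        rows = list(reader)
    for row in rows:
        row[column] = label
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)  # number of test rows written
```

A call such as `write_constant_submission("sample_submission.csv", "submission.csv", most_common_label("train.csv"))` would then produce a file in the same shape as the sample. If the sample submission turns out to use a different layout (for example, one column per species), the second function would need to be adapted accordingly.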
We are grateful to the many citizen scientists and researchers who shared the recordings which made this competition possible. The full list of authors can be found in authors.csv.
Visualizations of some of the bird sounds you will encounter in this challenge.
Some of these recordings are under a Creative Commons Attribution-NonCommercial-NoDerivs 2.5 license, meaning that you cannot sell or distribute modified copies of the calls. If you would like to share example calls, please download them directly from xeno-canto and give proper attribution to the author.
List of potential species: