Caribbean Voices AI Hackathon by UWI AI Innovation Centre

Helping Anguilla, Antigua and Barbuda, and 17 other countries:
  • Anguilla
  • Antigua and Barbuda
  • Bahamas
  • Barbados
  • Belize
  • Bermuda
  • Virgin Islands (British)
  • Cayman Islands
  • Curaçao
  • Dominica
  • Grenada
  • Guyana
  • Jamaica
  • Montserrat
  • Saint Kitts and Nevis
  • Saint Vincent and the Grenadines
  • Suriname
  • Trinidad and Tobago
  • Turks and Caicos Islands
$7 500 USD
Under code review
Natural Language Processing
Automatic Speech Recognition
48 joined
28 active
Start: Nov 17, 25
Close: Dec 07, 25
Reveal: Dec 07, 25
About

The dataset consists of approximately 28,000 audio clips, each lasting around 30 seconds. Each audio clip is paired with a manually verified transcription. The dataset was generated from an archive donated by the BBC Caribbean Service to The UWI after it ceased broadcasting.

Columns in train_transcripts.csv:

  • clip_id: Unique identifier for each audio clip.
  • file_name: Path to the .wav file.
  • transcript: Text transcription of the clip.
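
The three columns above map naturally onto one record per clip. The following is a minimal sketch of reading them with Python's standard `csv` module; the `Clip` class, the function names, and the `audio_root` parameter are illustrative assumptions, not part of the competition materials, and it assumes the CSV has a header row with exactly these column names.

```python
import csv
from dataclasses import dataclass
from pathlib import Path
from typing import List, TextIO


@dataclass
class Clip:
    """One row of train_transcripts.csv (hypothetical helper type)."""
    clip_id: str
    audio_path: Path
    transcript: str


def read_transcripts(handle: TextIO, audio_root: Path = Path(".")) -> List[Clip]:
    """Parse rows that carry the three documented columns into Clip records."""
    reader = csv.DictReader(handle)
    required = {"clip_id", "file_name", "transcript"}
    missing = required - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return [
        # file_name is described as a path to the .wav file; resolving it
        # against audio_root is an assumption about where the audio lives.
        Clip(row["clip_id"], audio_root / row["file_name"], row["transcript"].strip())
        for row in reader
    ]


def load_transcripts(csv_path: str, audio_root: Path = Path(".")) -> List[Clip]:
    """Open the CSV from disk and parse it (assumes UTF-8 encoding)."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return read_transcripts(handle, audio_root)
```

Calling `load_transcripts("train_transcripts.csv")` would then return a list of `Clip` records whose `audio_path` can be passed to whichever audio loader you prefer.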
Files
Train contains the target. This is the dataset that you will use to train your model.
This file contains the audio files from both train and test.
An example of what your submission file should look like. The order of the rows does not matter, but the "ID" values must be correct.
Test resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
This notebook will help you make your first submission to the Zindi leaderboard.
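
Because row order does not matter in the submission but the "ID" values must be correct, a quick set comparison before uploading can catch mismatches. The sketch below is one way to do that under stated assumptions: the helper names are hypothetical, and the column name `ID` is taken from the description above and may differ in the actual sample submission file.

```python
import csv
from collections import Counter
from typing import Dict, Iterable, List, TextIO


def read_ids(handle: TextIO, id_column: str = "ID") -> List[str]:
    """Return the values of the ID column from an open CSV file."""
    reader = csv.DictReader(handle)
    if id_column not in (reader.fieldnames or []):
        raise ValueError(f"column {id_column!r} not found")
    return [row[id_column] for row in reader]


def check_submission_ids(
    expected: Iterable[str], submitted: Iterable[str]
) -> Dict[str, List[str]]:
    """Compare ID collections while ignoring row order.

    Reports IDs that are missing from the submission, IDs that are not
    expected, and IDs that appear more than once in the submission.
    """
    expected_set = set(expected)
    counts = Counter(submitted)
    return {
        "missing": sorted(expected_set - set(counts)),
        "unexpected": sorted(set(counts) - expected_set),
        "duplicates": sorted(i for i, n in counts.items() if n > 1),
    }
```

An empty list under every key suggests the submission covers exactly the expected IDs; this checks identifiers only and says nothing about transcript quality.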