AI4D Yorùbá Machine Translation Challenge
$2 000 USD
Can you translate Yorùbá to English?
503 data scientists enrolled, 85 on the leaderboard
4 December 2020—30 May 2021
178 days

The training data consist of 10,054 parallel Yorùbá-English sentences drawn from a range of domains: news, Yorùbá proverbs, movie transcripts, TED talks, radio broadcast transcripts, localization translations, and books.

Variable definitions

  • English: English sentences
  • Yoruba: Yorùbá sentences

Files available for download:

  • Train.csv - contains the target. This is the dataset that you will use to train your model.
  • Test.csv - resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
  • SampleSubmission.csv - shows the submission format for this competition, with the ID column mirroring that of Test.csv and the ‘English’ column containing your translations. The order of the rows does not matter, but the IDs in the ID column must match those in Test.csv.
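
The workflow implied by the three files can be sketched as follows. This is a minimal illustration, not an official starter script: it assumes the columns are named exactly ID, Yoruba and English, and the helper names (build_submission, write_submission, placeholder_translate) as well as the sample ID and sentence are hypothetical. The placeholder simply echoes its input and stands in for a trained model.

```python
import csv
import io


def build_submission(test_rows, translate):
    """Pair each test ID with a translation of its Yoruba sentence.

    test_rows: iterable of dicts with "ID" and "Yoruba" keys (assumed names).
    translate: any callable that maps a Yoruba string to an English string.
    """
    return [
        {"ID": row["ID"], "English": translate(row["Yoruba"])}
        for row in test_rows
    ]


def write_submission(rows, handle):
    """Write rows in the assumed SampleSubmission.csv layout: ID, English."""
    writer = csv.DictWriter(handle, fieldnames=["ID", "English"])
    writer.writeheader()
    writer.writerows(rows)


def placeholder_translate(sentence):
    # Stand-in for a real model: returns the input unchanged.
    return sentence


# Tiny in-memory stand-in for Test.csv; the ID and sentence are made up.
sample_test = io.StringIO("ID,Yoruba\nID_EXAMPLE_1,Ẹ kú àárọ̀\n")
test_rows = list(csv.DictReader(sample_test))

submission = build_submission(test_rows, placeholder_translate)

out = io.StringIO()
write_submission(submission, out)
print(out.getvalue())
```

To run this against the real data, replace the in-memory buffers with open("Test.csv", newline="", encoding="utf-8") for reading and an output file opened with the same newline and encoding arguments for writing. Because the IDs are copied directly from the test rows, they match Test.csv by construction, and row order is left unchanged.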