
AI4D Yorùbá Machine Translation Challenge

Helping Nigeria
$2 000 USD
Completed (almost 5 years ago)
Machine Translation
683 joined
84 active
Start: Dec 04, 2020
Close: May 30, 2021
Reveal: May 30, 2021
Amy_Bray
Zindi
Error metric fixed and leaderboard rescored
Platform · 6 May 2021, 13:04 · 7

Dear competitors,

Thank you for your patience while we worked on the error metric. It proved harder than we initially thought due to the diacritics and different characters.

We have implemented the ROUGE score, reporting the F-measure. This error metric was implemented on 5 May 2021, and the leaderboard was rescored.

The Recall-Oriented Understudy for Gisting Evaluation (ROUGE) scoring algorithm calculates the similarity between a candidate document and a collection of reference documents. Use the ROUGE score to evaluate the quality of document translation and summarization models [ref].
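For anyone who wants to reproduce the metric locally, here is a minimal sketch of a ROUGE-1 F-measure between a candidate translation and a single reference. This is a simplified reimplementation assuming plain whitespace tokenization (which preserves Yorùbá diacritics), not necessarily the exact scoring script used on the leaderboard.

```python
from collections import Counter

def rouge1_f(candidate: str, reference: str) -> float:
    """ROUGE-1 F-measure: clipped unigram overlap between candidate and reference."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    overlap = sum((cand & ref).values())  # each unigram counted at most min(cand, ref) times
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Example with diacritics preserved by whitespace tokenization:
print(round(rouge1_f("mo fẹ́ jẹun", "mo fẹ́ jẹun bayi"), 3))  # → 0.857
```

Note that with whitespace tokenization, a diacritic difference (e.g. "fẹ́" vs "fe") makes the tokens distinct, so diacritics directly affect the score.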

Once again, thank you for your patience and perseverance during this challenge.

Discussion (7 answers)
AkashPB

Hi, Thanks for the update :)

But which ROUGE score is used: ROUGE-L or ROUGE-1?

Amy_Bray
Zindi

Hi, it is ROUGE-N (n-gram) scoring with N = 1 (ROUGE-1), reporting the F-measure.

AkashPB

OK, got it!

Would it be possible to publish some starter code using this ROUGE score for evaluation and model training? As a beginner, it's not clear to me how to use it.

7 May 2021, 10:54
flamethrower

Hello @Zindi, is punctuation important for the translation? Is it taken into account for the ROUGE score?

Update: I see the ROUGE score ignores punctuation when I try the Python metric. Thanks.

Amy_Bray
Zindi

Yes, the diacritics and accents are taken into account.

flamethrower

Thank you for the clarification