Hi everyone, I hope you're enjoying this challenge. But it seems that there is a bug in the scoring script.
I went through the eval script (https://github.com/Lelapa-AI/zindi-inkuba-notebook/blob/main/utils/eval.py) and noticed two bugs:
1/ As it is, the script only scores the translation task; the others (NLI and sentiment analysis) will always be scored zero because of a type mismatch. The ground truth is mapped to integers via a dictionary, but the prediction stays a string, so the F1 score is always 0 even when the prediction is correct.
To fix this bug, we can replace line 26 of the script with:
predicted_label = int(row["Response"])
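A minimal sketch of the type-mismatch bug (the variable names and label map below are hypothetical, not the exact ones in eval.py): the ground truth becomes an int via the dictionary, while the prediction read from the submission CSV stays a string, so they never compare equal until the prediction is cast.

```python
# Hypothetical label map, for illustration only.
label_map = {"positive": 1, "negative": 0}

truth = [label_map[t] for t in ["positive", "negative"]]  # [1, 0] (ints)
preds = ["1", "0"]  # values read from a CSV arrive as strings

# int 1 is never equal to str "1", so every comparison fails
# and the resulting F1 is 0 even though the predictions are right.
print(truth[0] == preds[0])        # False

# Casting the prediction, as in the proposed fix, makes it match.
preds_int = [int(p) for p in preds]
print(truth[0] == preds_int[0])    # True
```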
2/ Also, the weights are not equal per task: machine translation gets weight 300/302 while the other two tasks get 1/302 each, whereas the expected weight per task is 1/3. In the code, the 300 chrF scores are gathered for the translation part, and then the F1 scores of NLI and sentiment analysis are appended to the same list. To solve it, just reinitialize the scores list from line 10 with its own mean before line 44.
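A rough sketch of the weighting issue under assumed numbers (the scores and variable names are made up for illustration): averaging all 302 values directly lets translation dominate, while collapsing the 300 chrF scores to their mean first gives each task an equal 1/3 weight.

```python
# Hypothetical per-task scores, for illustration only.
chrf_scores = [0.30] * 300   # 300 per-sentence translation chrF scores
nli_f1 = 0.60
sentiment_f1 = 0.45

# Buggy behaviour: translation carries weight 300/302.
buggy = sum(chrf_scores + [nli_f1, sentiment_f1]) / 302

# Proposed fix: reinitialize the list with its own mean first,
# so each task contributes exactly one number to the average.
scores = [sum(chrf_scores) / len(chrf_scores)]
scores += [nli_f1, sentiment_f1]
fixed = sum(scores) / len(scores)   # equal 1/3 weight per task

print(round(buggy, 4))   # dominated by the 0.30 translation scores
print(round(fixed, 4))   # (0.30 + 0.60 + 0.45) / 3 = 0.45
```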
Great find! Now I get why your score is not 0.9 :P
No @snow 🙏. It means that my score should be greater than 0.1. For the time being I have only worked on translation and sentiment. My local CV is 0.32 for translation and 0.45 for sentiment, so I expected a score around (0.32 + 0.45)/3 ≈ 0.256.
I also agree that it gives more weight to the translation task and less to the other tasks.
Yes, I also observed these two bugs in the scoring script yesterday.
Great idea
For anyone who hasn't seen it already: this has been patched. You can see the commit that introduced the fix here:
https://github.com/Lelapa-AI/zindi-inkuba-notebook/commit/25eda434e171ea7b2fcd1e3920cb8e859816172c
Can the grader be rerun so we can see the updated scores?
Hello! We'll be updating the scoring early next week. Will keep you posted!
Thanks.
Is it fixed? I still see no change in my scores.