
Kenya Clinical Reasoning Challenge

Helping Kenya
$10 000 USD
Completed (8 months ago)
Prediction
Natural Language Processing
SLM
1664 joined
440 active
Start
Apr 03, 25
Close
Jun 29, 25
Reveal
Jun 30, 25
Misalignment between the evaluation metric (ROUGE)
Data · 30 Jun 2025, 11:39 · 0

The primary flaw was a misalignment between the evaluation metric (ROUGE) and the task objective (clinical reasoning). ROUGE scores word-level overlap with a reference text, so it rewarded surface wording rather than semantic or clinical correctness. That pushed model and strategy choices away from the challenge's intended purpose.
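
To make the point concrete, here is a minimal sketch of a simplified ROUGE-1 F1 (lowercased whitespace tokens, no stemming). It is an assumption for illustration only, not the competition's actual scoring code, and the three sentences are invented examples rather than challenge data. It shows a clinically opposite answer outscoring a reasonable paraphrase purely because it reuses the reference wording.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased whitespace tokens."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each token counts at most as often as it appears in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, invented for this illustration.
reference = "give oral rehydration salts and refer the child to hospital"
# Clinically opposite advice that copies almost every reference word.
contradiction = "do not give oral rehydration salts and do not refer the child to hospital"
# Roughly the same advice, expressed with different words.
paraphrase = "start ORS now then send this patient for admission"

score_contradiction = rouge1_f1(reference, contradiction)
score_paraphrase = rouge1_f1(reference, paraphrase)

print(f"contradiction: {score_contradiction:.3f}")
print(f"paraphrase:    {score_paraphrase:.3f}")
```

Under these assumptions the contradictory answer scores about 0.83 and the paraphrase scores 0.0, so a model optimized against this kind of metric is encouraged to mimic reference phrasing rather than to reason correctly.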

Discussion 0 answers