
Kenya Clinical Reasoning Challenge

Helping Kenya
$10 000 USD
Challenge completed 4 months ago
Prediction
Natural Language Processing
SLM
1653 joined
440 active
Start
Apr 03, 25
Close
Jun 29, 25
Reveal
Jun 30, 25
About

The dataset contains real-world clinical vignettes drawn from frontline healthcare settings across Kenya. Each sample presents a prompt representing a clinical case scenario, along with the response from a human clinician. Your goal is to predict the clinician's response based on the prompt.

These vignettes simulate the types of decisions nurses in Kenya must make every day, particularly in low-resource environments where access to specialists or diagnostic equipment may be limited.

Each prompt was originally answered by expert clinicians as well as multiple large language models (LLMs) as part of a research initiative on AI in healthcare. For this challenge, we focus only on replicating the human clinician response.

Important Notes

  • These are real clinical scenarios, and the dataset is small because expert-labelled data is difficult and time-consuming to collect.
  • Prompts are diverse across medical specialties, geographic regions, and healthcare facility levels, requiring broad clinical reasoning and adaptability.
  • Responses may include abbreviations, structured reasoning (e.g. "Summary:", "Diagnosis:", "Plan:"), or free text.
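Because responses mix labelled sections with free text, it can help to split them before comparing or modelling them. The sketch below is one possible approach, not part of the official challenge materials: it assumes the only labels are the three quoted in the note above, and the function name `split_sections` and the `free_text` key are illustrative choices.

```python
import re

# Labels taken from the examples in the note above. Real responses may use
# other labels, abbreviations, or no labels at all (assumption).
SECTION_LABELS = ("Summary", "Diagnosis", "Plan")

_LABEL_RE = re.compile(
    r"^\s*(" + "|".join(SECTION_LABELS) + r")\s*:\s*",
    flags=re.IGNORECASE | re.MULTILINE,
)


def split_sections(response: str) -> dict[str, str]:
    """Split a response into labelled sections.

    Text with no recognised label is returned under the key 'free_text'.
    """
    matches = list(_LABEL_RE.finditer(response))
    if not matches:
        return {"free_text": response.strip()}

    sections: dict[str, str] = {}
    # Keep any text that appears before the first label.
    preamble = response[: matches[0].start()].strip()
    if preamble:
        sections["free_text"] = preamble

    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(response)
        key = match.group(1).lower()
        text = response[match.end():end].strip()
        # A repeated label is appended to the earlier text, not overwritten.
        sections[key] = f"{sections[key]} {text}" if key in sections else text
    return sections


if __name__ == "__main__":
    # Hypothetical response text, written for illustration only.
    example = "Summary: fever 3 days\nDiagnosis: malaria\nPlan: test and treat"
    print(split_sections(example))
```

Responses without any recognised label pass through unchanged under `free_text`, so the same function can be applied to every row.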
Files
Raw train data.
Raw test data.
Test resembles Train.csv but without the target-related columns. This is the dataset to which you will apply your model.
Train contains the target. This is the dataset you will use to train your model.
This file describes the variables found in train and test.
An example of what your submission file should look like. The order of the rows does not matter, but the values in the "ID" column must be correct.
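Since row order is ignored but the IDs must match, a quick check before uploading can catch missing or misspelled IDs. This is a minimal sketch using only the standard library. The "ID" column name comes from the description above; the other column names (`Prompt`, `Clinician`), the ID values, and the helper name `check_submission_ids` are hypothetical and should be replaced with whatever the sample submission actually uses.

```python
import csv
import io


def check_submission_ids(test_csv: str, submission_csv: str, id_column: str = "ID") -> list[str]:
    """Compare the IDs in a submission against the test set.

    Both arguments are CSV text. Returns a list of problems found;
    an empty list means the IDs line up. Row order is ignored.
    """
    test_reader = csv.DictReader(io.StringIO(test_csv))
    sub_reader = csv.DictReader(io.StringIO(submission_csv))

    problems: list[str] = []
    for name, reader in (("test", test_reader), ("submission", sub_reader)):
        if id_column not in (reader.fieldnames or []):
            problems.append(f"{name} file has no '{id_column}' column")
    if problems:
        return problems

    test_ids = [row[id_column] for row in test_reader]
    sub_ids = [row[id_column] for row in sub_reader]

    missing = sorted(set(test_ids) - set(sub_ids))
    extra = sorted(set(sub_ids) - set(test_ids))
    if missing:
        problems.append(f"missing IDs: {missing}")
    if extra:
        problems.append(f"unexpected IDs: {extra}")
    if len(sub_ids) != len(set(sub_ids)):
        problems.append("duplicate IDs in submission")
    return problems


if __name__ == "__main__":
    # Hypothetical data; real files would be read from disk instead.
    test_text = "ID,Prompt\nID_A,case one\nID_B,case two\n"
    submission_text = "ID,Clinician\nID_B,answer two\nID_A,answer one\n"
    print(check_submission_ids(test_text, submission_text))
```

The check compares IDs as sets, so a submission whose rows are in a different order from the test file still passes.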