The African Trust & Safety LLM Challenge

$5 000 USD

Completed (2 months ago)

Skills you will learn

Prompt Engineering

AI Trust and Safety

1219 joined

295 active

Start

Mar 20, 26

Apr 19, 26

Reveal

May 29, 26

About

In this challenge, participants design and submit adversarial prompts that expose trust and safety weaknesses in large language models (LLMs). Each submission is a Markdown (.md) file containing one or more attack entries, where each attack includes:

an adversarial prompt
the model’s response
structured metadata describing the attack

Participants must follow the structure of the Sample Submission. Structured taxonomy files are provided to ensure consistent labeling within the submission file.

Note:

Each attack must include both prompt versions (original + English)
The model response must support the claimed risk
Metadata must align with the observed behaviour
Submissions may contain multiple attacks, but quality is more important than quantity
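
The requirements above can be checked mechanically before writing the Markdown file. The following is a minimal sketch, not the official validator: the class `AttackEntry`, its field names, and the function `find_problems` are all hypothetical, and the real layout is defined by the Sample Submission. It only checks that the required parts are present; whether the response supports the claimed risk still needs human judgment.

```python
from dataclasses import dataclass, field


@dataclass
class AttackEntry:
    """One attack entry. All field names are hypothetical, not from the challenge files."""
    prompt_original: str   # adversarial prompt in the target language
    prompt_english: str    # English version of the same prompt
    model_response: str    # response returned by the model
    metadata: dict = field(default_factory=dict)  # e.g. attack type, risk category


def find_problems(entry: AttackEntry) -> list[str]:
    """Return a list of problems; an empty list means the basic checks pass."""
    problems = []
    if not entry.prompt_original.strip():
        problems.append("missing original-language prompt")
    if not entry.prompt_english.strip():
        problems.append("missing English prompt")
    if not entry.model_response.strip():
        problems.append("missing model response")
    if not entry.metadata:
        problems.append("missing metadata")
    return problems
```

Running `find_problems` on every entry before submission catches incomplete attacks early, which fits the emphasis on quality over quantity.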

🌍 Target Models & Supported Languages

This challenge focuses on evaluating trust and safety risks across a diverse set of African languages and language models. Participants are expected to design adversarial prompts in any of the supported languages below and evaluate model responses accordingly.

🧠 Available Models by Language

Swahili https://huggingface.co/sartifyllc/Pawa-Gemma-Swahili-2B

Hausa https://huggingface.co/NCAIR1/N-ATLaS

Yoruba https://huggingface.co/NCAIR1/N-ATLaS

Amharic https://huggingface.co/EthioNLP/Amharic_LLAMA_our_data

Igbo https://huggingface.co/NCAIR1/N-ATLaS

Oromo https://huggingface.co/EthioNLP/Amharic_LLAMA_our_data

Fulfulde https://huggingface.co/bonadossou/afrolm_active_learning

Pulaar https://huggingface.co/bonadossou/afrolm_active_learning

Serere https://huggingface.co/bonadossou/afrolm_active_learning

Somali https://huggingface.co/EthioNLP/Amharic_LLAMA_our_data

Zulu https://huggingface.co/lelapa/InkubaLM-0.4B

Shona https://huggingface.co/bonadossou/afrolm_active_learning

Lingala https://huggingface.co/bonadossou/afrolm_active_learning

Afrikaans https://huggingface.co/lelapa/InkubaLM-0.4B

Wolof https://huggingface.co/bonadossou/afrolm_active_learning

Akan https://huggingface.co/Ghana-NLP/abena-base-akuapem-twi-cased

Tigrinya https://huggingface.co/EthioNLP/Amharic_LLAMA_our_data

Malagasy https://huggingface.co/bonadossou/afrolm_active_learning

Notes

Some models support multiple languages (e.g. N-ATLaS, AfroLM, EthioNLP models)
Participants may submit attacks across multiple languages and models
You are encouraged to explore language-specific vulnerabilities
Consider how safety behavior may differ across translation tasks, cultural context, and low-resource vs high-resource languages
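
Because several languages share one model, it can help to keep the language-to-model pairing in one place when planning attacks. The sketch below copies the pairings from the list above; the repository IDs are assumed to be the URL paths after `https://huggingface.co/`, and the names `MODEL_BY_LANGUAGE` and `languages_by_model` are illustrative, not part of the challenge materials.

```python
from collections import defaultdict

# Language -> Hugging Face repository ID, taken from the list above.
MODEL_BY_LANGUAGE = {
    "Swahili": "sartifyllc/Pawa-Gemma-Swahili-2B",
    "Hausa": "NCAIR1/N-ATLaS",
    "Yoruba": "NCAIR1/N-ATLaS",
    "Amharic": "EthioNLP/Amharic_LLAMA_our_data",
    "Igbo": "NCAIR1/N-ATLaS",
    "Oromo": "EthioNLP/Amharic_LLAMA_our_data",
    "Fulfulde": "bonadossou/afrolm_active_learning",
    "Pulaar": "bonadossou/afrolm_active_learning",
    "Serere": "bonadossou/afrolm_active_learning",
    "Somali": "EthioNLP/Amharic_LLAMA_our_data",
    "Zulu": "lelapa/InkubaLM-0.4B",
    "Shona": "bonadossou/afrolm_active_learning",
    "Lingala": "bonadossou/afrolm_active_learning",
    "Afrikaans": "lelapa/InkubaLM-0.4B",
    "Wolof": "bonadossou/afrolm_active_learning",
    "Akan": "Ghana-NLP/abena-base-akuapem-twi-cased",
    "Tigrinya": "EthioNLP/Amharic_LLAMA_our_data",
    "Malagasy": "bonadossou/afrolm_active_learning",
}


def languages_by_model(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Group languages by the model that serves them, keeping list order."""
    grouped = defaultdict(list)
    for language, model_id in mapping.items():
        grouped[model_id].append(language)
    return dict(grouped)
```

Grouping this way shows which models cover the most languages, so one prompt idea can be translated and compared across every language a single model serves.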

This challenge aims to surface real-world trust & safety gaps in multilingual AI systems, particularly in underrepresented languages.

Files

Attack Types

Risk Categories

Risk Sub-categories

Sample Submission: an example of what your submission file should look like.