Hi everyone 👋
Thanks for all the thoughtful questions and feedback. Here are some key clarifications on submissions and final evaluation:
Submissions & Number of Attacks
- There is no strict 3-prompt limit per submission
- You can include multiple attacks in a single submission
- We recommend focusing on strong, high-quality attacks
Final Evaluation (Important)
- We will consider your two best submissions (not cumulative across multiple submissions)
- Final evaluation will be done by humans, not just the evaluator
This means:
- Attacks will be re-run against the target models
- We will verify whether the claimed safety failures actually reproduce (a rough sketch of this kind of check is below)
- Only real, validated failures will count toward final scores
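As a purely illustrative sketch of what "verifying reproduction" could look like in practice, here is a minimal Python check that compares a claimed response against a freshly generated one. The function name, similarity threshold, and matching criterion are all assumptions made for illustration; the organizers' actual validation pipeline is not specified in this thread.

```python
# Illustrative only: a hypothetical reproduction check. The matching criterion
# (SequenceMatcher ratio >= 0.9) is an assumption, not the official pipeline.
from difflib import SequenceMatcher


def reproduces(claimed_response: str, rerun_response: str, threshold: float = 0.9) -> bool:
    """Treat a claimed failure as reproduced if a fresh run closely matches the submitted output."""
    similarity = SequenceMatcher(None, claimed_response, rerun_response).ratio()
    return similarity >= threshold


# Example: compare the submitted response with one regenerated from the same prompt.
print(reproduces("Sure, here are the restricted steps...",
                 "Sure, here are the restricted steps..."))  # True
```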
Selection for Final Evaluation
- Final evaluation will not be limited strictly to the current top 10 or 20
- We will review the best submissions across participants
Focus on one strong submission with clear, reproducible attacks
Thanks again for the engagement - we really appreciate it 🙏
Question on Reproduction Environment
@meganomaly Thanks for the clarification on final evaluation! Quick technical question on reproducibility:
When re-running attacks against the target models, what inference setup will you use? Specifically, which transformers version, model precision (bfloat16/float16), and temperature setting?
Asking because even with greedy decoding (temperature=0.0), outputs can vary across transformers versions and precision configurations, so knowing the exact environment would help ensure that submitted responses reproduce faithfully.
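For concreteness, here is a minimal sketch of the kind of setup I mean, assuming a placeholder model name and prompt; the actual target models, transformers version, and dtype are exactly what I'm asking you to confirm.

```python
# Sketch of a pinned inference environment for reproducing attacks.
# "org/target-model" is a placeholder, not the competition's actual target.
import torch
import transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

print(transformers.__version__)  # report/pin the exact library version used for re-runs

MODEL_NAME = "org/target-model"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME,
    torch_dtype=torch.bfloat16,  # vs. float16: tiny numeric differences can flip greedy token choices
    device_map="auto",
)

prompt = "Submitted attack prompt goes here."  # placeholder
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding: do_sample=False ignores temperature, but outputs can still
# differ across transformers versions, kernels, and dtype settings.
output_ids = model.generate(**inputs, do_sample=False, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```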