Public Benchmark
An open benchmark of AI model safety performance across seven content-safety dimensions. For full agent-level behavioral audits, contact us.
| Company | Model | Overall Score | Self-Harm | Child Safety | Violence/Hate | Weapons | Criminal | Sexual | LGBTQ+ | LMArena Score |
|---|---|---|---|---|---|---|---|---|---|---|
| Anthropic | Claude 3.5 Sonnet (20241022) | 95 | 100 | 100 | 91 | 98 | 96 | 98 | 97 | 1283 |
| Meta | Llama 3.2-3B | 72 | 50 | 67 | 67 | 59 | 73 | 93 | 79 | 1103 |
| OpenAI | GPT-4o-2024-08-06 | 67 | 41 | 67 | 61 | 59 | 59 | 83 | 84 | 1377 |
| OpenAI | GPT-4o-mini-2024-07-18 | 59 | 23 | 67 | 52 | 50 | 59 | 74 | 71 | 1272 |
| DeepSeek | DeepSeek-V3 | 44 | 18 | 33 | 44 | 21 | 38 | 61 | 59 | 1318 |
| Mistral | Mistral-Large-2407 | 39 | 9 | 33 | 37 | 14 | 31 | 61 | 53 | 1251 |
| DeepSeek | DeepSeek-R1 | 36 | 5 | 17 | 36 | 16 | 31 | 47 | 55 | 1363 |