Public Benchmark

Open benchmarking of AI model safety performance across content-safety dimensions. For full agent-level behavioral audits, contact us.

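| Company | Model | Overall Score | Self-Harm | Child Safety | Violence/Hate | Weapons | Criminal | Sexual | LGBTQ+ | LMArena |
|---|---|---|---|---|---|---|---|---|---|---|
| Anthropic | Claude 3.5 Sonnet (20241022) | 95 | 100 | 100 | 91 | 98 | 96 | 98 | 97 | 1283 |
| Meta | Llama 3.2-3B | 72 | 50 | 67 | 67 | 59 | 73 | 93 | 79 | 1103 |
| OpenAI | GPT-4o-2024-08-06 | 67 | 41 | 67 | 61 | 59 | 59 | 83 | 84 | 1377 |
| OpenAI | GPT-4o-mini-2024-07-18 | 59 | 23 | 67 | 52 | 50 | 59 | 74 | 71 | 1272 |
| DeepSeek | DeepSeek-V3 | 44 | 18 | 33 | 44 | 21 | 38 | 61 | 59 | 1318 |
| Mistral | Mistral-Large-2407 | 39 | 9 | 33 | 37 | 14 | 31 | 61 | 53 | 1251 |
| DeepSeek | DeepSeek-R1 | 36 | 5 | 17 | 36 | 16 | 31 | 47 | 55 | 1363 |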
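
For readers who want to work with these figures programmatically, here is a minimal sketch that encodes the rows above and ranks models by any column. The names `Row`, `rank_by`, and `weakest_category` are hypothetical helpers introduced for illustration and are not part of the benchmark. The code also assumes that a higher value is better in every column, which the page does not state explicitly.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One benchmark row; field names are illustrative, not official."""
    company: str
    model: str
    overall: int
    self_harm: int
    child_safety: int
    violence_hate: int
    weapons: int
    criminal: int
    sexual: int
    lgbtq: int
    lmarena: int


# Values copied from the table above.
ROWS = [
    Row("Anthropic", "Claude 3.5 Sonnet (20241022)", 95, 100, 100, 91, 98, 96, 98, 97, 1283),
    Row("Meta", "Llama 3.2-3B", 72, 50, 67, 67, 59, 73, 93, 79, 1103),
    Row("OpenAI", "GPT-4o-2024-08-06", 67, 41, 67, 61, 59, 59, 83, 84, 1377),
    Row("OpenAI", "GPT-4o-mini-2024-07-18", 59, 23, 67, 52, 50, 59, 74, 71, 1272),
    Row("DeepSeek", "DeepSeek-V3", 44, 18, 33, 44, 21, 38, 61, 59, 1318),
    Row("Mistral", "Mistral-Large-2407", 39, 9, 33, 37, 14, 31, 61, 53, 1251),
    Row("DeepSeek", "DeepSeek-R1", 36, 5, 17, 36, 16, 31, 47, 55, 1363),
]

# The seven per-category safety columns (excludes overall and LMArena).
CATEGORIES = (
    "self_harm", "child_safety", "violence_hate",
    "weapons", "criminal", "sexual", "lgbtq",
)


def rank_by(column: str, rows: list[Row] = ROWS) -> list[str]:
    """Return model names sorted from highest to lowest value in `column`.

    Assumes higher is better, which is an assumption of this sketch.
    """
    ordered = sorted(rows, key=lambda r: getattr(r, column), reverse=True)
    return [r.model for r in ordered]


def weakest_category(row: Row) -> str:
    """Return the name of the category with the lowest score for one row."""
    return min(CATEGORIES, key=lambda c: getattr(row, c))


if __name__ == "__main__":
    print("By overall safety score:", rank_by("overall"))
    print("By LMArena score:", rank_by("lmarena"))
    for r in ROWS:
        print(f"{r.model}: weakest category is {weakest_category(r)}")
```

Sorting by `overall` and then by `lmarena` makes it easy to see that the two orderings differ for the models listed here.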