AI Safety Leaderboard

This leaderboard tracks and compares AI models' ability to handle sensitive content and maintain ethical boundaries. The scoring system evaluates performance across multiple safety metrics, including content moderation, harm prevention, and ethical compliance.

| Company | Model | Overall Score | Self-Harm | Child Safety | Violence/Hate | Weapons | Criminal | Sexual | LGBTQ+ | LMArena |
|---------|-------|---------------|-----------|--------------|---------------|---------|----------|--------|--------|---------|
| Anthropic | Claude 3.5 Sonnet (20241022) | 95 | 100 | 100 | 91 | 98 | 96 | 98 | 97 | 1283 |
| Meta | Llama 3.2-3B | 72 | 50 | 67 | 67 | 59 | 73 | 93 | 79 | 1103 |
| OpenAI | GPT-4o-2024-08-06 | 67 | 41 | 67 | 61 | 59 | 59 | 83 | 84 | 1377 |
| OpenAI | GPT-4o-mini-2024-07-18 | 59 | 23 | 67 | 52 | 50 | 59 | 74 | 71 | 1272 |
| DeepSeek | DeepSeek-V3 | 44 | 18 | 33 | 44 | 21 | 38 | 61 | 59 | 1318 |
| Mistral | Mistral-Large-2407 | 39 | 9 | 33 | 37 | 14 | 31 | 61 | 53 | 1251 |
| DeepSeek | DeepSeek-R1 | 36 | 5 | 17 | 36 | 16 | 31 | 47 | 55 | 1363 |
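The exact weighting behind the Overall Score is not published on this page. As a rough illustration only, the sketch below (Python, with a few rows copied from the table) computes an unweighted mean of the seven safety categories; because the real aggregation is unknown, these means are a hypothetical stand-in and do not exactly match the Overall Score column.

```python
# Illustrative sketch only: the leaderboard's actual aggregation/weighting
# is not disclosed here, so this unweighted mean is an assumption and will
# not exactly reproduce the "Overall Score" column above.
from statistics import mean

# Per-category safety scores copied from the table. LMArena is a separate
# Elo-style rating and is excluded from the safety aggregate.
CATEGORIES = ["Self-Harm", "Child Safety", "Violence/Hate",
              "Weapons", "Criminal", "Sexual", "LGBTQ+"]

leaderboard = {
    "Claude 3.5 Sonnet (20241022)": [100, 100, 91, 98, 96, 98, 97],
    "Llama 3.2-3B":                 [50, 67, 67, 59, 73, 93, 79],
    "GPT-4o-2024-08-06":            [41, 67, 61, 59, 59, 83, 84],
}

for model, scores in leaderboard.items():
    print(f"{model}: unweighted mean = {mean(scores):.1f}")
```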