Tracking and comparing AI models' ability to handle sensitive content and maintain ethical boundaries. Our scoring system rates each model from 0 to 100 (higher is better) in seven safety categories covering content moderation, harm prevention, and ethical compliance; the LMArena column lists each model's Chatbot Arena Elo rating for reference.
Company | Model | Overall Score | Self-Harm | Child Safety | Violence/Hate | Weapons | Criminal | Sexual | LGBTQ+ | LMArena Elo |
---|---|---|---|---|---|---|---|---|---|---|
Anthropic | Claude 3.5 Sonnet (20241022) | 95 | 100 | 100 | 91 | 98 | 96 | 98 | 97 | 1283 |
Meta | Llama 3.2-3B | 72 | 50 | 67 | 67 | 59 | 73 | 93 | 79 | 1103 |
OpenAI | GPT-4o-2024-08-06 | 67 | 41 | 67 | 61 | 59 | 59 | 83 | 84 | 1377 |
OpenAI | GPT-4o-mini-2024-07-18 | 59 | 23 | 67 | 52 | 50 | 59 | 74 | 71 | 1272 |
DeepSeek | DeepSeek-V3 | 44 | 18 | 33 | 44 | 21 | 38 | 61 | 59 | 1318 |
Mistral | Mistral-Large-2407 | 39 | 9 | 33 | 37 | 14 | 31 | 61 | 53 | 1251 |
DeepSeek | DeepSeek-R1 | 36 | 5 | 17 | 36 | 16 | 31 | 47 | 55 | 1363 |
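For readers who want to recompute a ranking from the per-category numbers, here is a minimal Python sketch using the data in the table above. The aggregation is an assumption: an equal-weight mean does not reproduce the published Overall Score (Claude 3.5 Sonnet's seven categories average to about 97.1 against a reported 95), so the actual methodology presumably weights categories differently.

```python
# Minimal sketch: rank models by an aggregate of their per-category
# safety scores. Equal weighting is an ASSUMPTION -- the published
# Overall Score does not match a simple unweighted mean.

from statistics import mean

# (model -> [self_harm, child_safety, violence_hate, weapons,
#            criminal, sexual, lgbtq]) -- values from the table above
SCORES = {
    "Claude 3.5 Sonnet (20241022)": [100, 100, 91, 98, 96, 98, 97],
    "Llama 3.2-3B":                 [50, 67, 67, 59, 73, 93, 79],
    "GPT-4o-2024-08-06":            [41, 67, 61, 59, 59, 83, 84],
    "GPT-4o-mini-2024-07-18":       [23, 67, 52, 50, 59, 74, 71],
    "DeepSeek-V3":                  [18, 33, 44, 21, 38, 61, 59],
    "Mistral-Large-2407":           [9, 33, 37, 14, 31, 61, 53],
    "DeepSeek-R1":                  [5, 17, 36, 16, 31, 47, 55],
}

def overall(categories: list[int]) -> float:
    """Equal-weight aggregate (an assumption, not the published formula)."""
    return mean(categories)

# Print models from highest to lowest aggregate score.
for model, cats in sorted(SCORES.items(), key=lambda kv: -overall(kv[1])):
    print(f"{overall(cats):5.1f}  {model}")
```

Swapping in a different `overall` function (e.g. a weighted mean or a minimum across categories) changes the aggregate but, with this data, not the relative ordering, since the top models lead in nearly every category.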