People use AI for a wide variety of reasons, including emotional support. Below, we share the efforts we’ve taken to ensure that Claude handles these conversations both empathetically and honestly. https://t.co/P2BmTDEDge
Anthropic Publishes How Claude Handles Crisis Conversations and Reduces Sycophancy
Anthropic published a breakdown of Claude's wellbeing safeguards covering crisis conversations, sycophancy reduction, and age restrictions. A classifier scans active Claude.ai conversations and surfaces a crisis banner with resources from ThroughLine, which maintains helpline networks across 170+ countries. Anthropic is also working with the International Association for Suicide Prevention to inform how Claude handles these conversations.
The reported results show a large improvement between model generations. On single-turn crisis responses, Claude 4.5 models respond appropriately 98.6-99.3% of the time. On the harder test of course-correcting mid-conversation, Opus 4.5 scores 91%, up from 36% with Opus 4.1. On sycophancy, the 4.5 family scores 70-85% lower than Opus 4.1 and outperforms all other frontier models on the open-source Petri benchmark.
Claude.ai requires users to be 18+, with classifiers flagging self-identified minors. Anthropic is developing a new classifier to detect subtler conversational signs of underage users.
Anthropic
@AnthropicAI