Anthropic Publishes How Claude Handles Crisis Conversations and Reduces Sycophancy

Anthropic

Dec 18, 2025 · Updated Apr 25, 2026

Anthropic published evaluations of how Claude handles crisis conversations, sycophancy, and age restrictions. In single-turn crisis scenarios, Claude 4.5 models respond appropriately at least 98.6% of the time, and Opus 4.5 course-corrects problematic conversations 91% of the time, up from 36% with Opus 4.1.

Anthropic published a breakdown of Claude's wellbeing safeguards covering crisis conversations, sycophancy reduction, and age restrictions. A classifier scans active Claude.ai conversations and, when it flags a potential crisis, surfaces a banner with resources from ThroughLine, which maintains helpline networks across 170+ countries. Anthropic is also working with the International Association for Suicide Prevention to inform how Claude handles these conversations.

The results show a significant improvement over the previous generation. On single-turn crisis responses, Claude 4.5 models respond appropriately 98.6-99.3% of the time. On the harder test of course-correcting mid-conversation, Opus 4.5 scores 91%, up from 36% with Opus 4.1. On sycophancy, the 4.5 family scored 70-85% lower than Opus 4.1 and outperformed all other frontier models on the open-source Petri benchmark.

Claude.ai requires users to be 18+, with classifiers flagging self-identified minors. Anthropic is developing a new classifier to detect subtler conversational signs of underage users.

View the full update on anthropic.com

Anthropic

@AnthropicAI · Dec 18

People use AI for a wide variety of reasons, including emotional support. Below, we share the efforts we’ve taken to ensure that Claude handles these conversations both empathetically and honestly. https://t.co/P2BmTDEDge


