New Anthropic research: Measuring AI agent autonomy in practice. We analyzed millions of interactions across Claude Code and our API to understand how much autonomy people grant to agents, where they’re deployed, and what risks they may pose. Read more: https://t.co/CllNkMF4ZZ
Anthropic Measures How Agents Actually Gain Autonomy in the Wild
Anthropic· Updated
Anthropic analyzed millions of Claude Code and API interactions to measure real-world agent autonomy. Autonomous session lengths nearly doubled in three months, and software engineering dominates at 50% of agent activity - with healthcare and finance as emerging domains.
Software engineering accounts for ~50% of agentic tool calls, with emerging use in healthcare, finance, and cybersecurity. Most actions remain low-risk and reversible, but risky domain adoption is growing. The central finding: autonomy is co-constructed between model, user, and product - pre-deployment evaluations can't capture it alone, making post-deployment monitoring infrastructure essential.
Anthropic's paper includes recommendations for model developers, product developers, and policymakers on managing autonomy and risk in deployed agents.
Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →

