New on the Engineering Blog: The access and permissions we grant agents should evolve with their capabilities. In our own products, we set these parameters through sandboxing, which limits the scope of any potentially destructive actions. Read more: https://t.co/KfBKW8O9kP
Anthropic Hardens AI Agents by Backing Human Oversight With Environment Containment
This shift addresses approval fatigue, where users faced with frequent permission prompts stop scrutinizing agent requests. Anthropic found that model-layer defenses remain probabilistic, which is why the company pairs its safety-principle training with hard technical constraints. Environment-layer isolation lets developers deploy agents that run unattended without putting the underlying host system at risk.
You can use these patterns to harden your own deployments, for example by adopting the open-source sandbox runtime. Anthropic also introduced a defensive proxy to prevent data from being exfiltrated through domains that are otherwise considered safe. These containment features are currently integrated into Claude Code and Claude Cowork, with enterprise-grade path allowlists available.
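To make the path-allowlist idea concrete, here is a minimal Python sketch of the general technique. It is not Anthropic's implementation or the sandbox runtime's API: the function name `is_path_allowed` and the example directories are hypothetical, chosen only for illustration. The key point it shows is that a path should be resolved before it is compared, so that `..` segments and symlinks cannot step outside an allowed root.

```python
from pathlib import Path


def is_path_allowed(candidate: str, allowed_roots: list[str]) -> bool:
    """Return True only if `candidate` resolves inside one of `allowed_roots`.

    Hypothetical helper for illustration; not part of any Anthropic tool.
    """
    # Resolve first so ".." segments and symlinks are collapsed before comparing.
    resolved = Path(candidate).resolve()
    for root in allowed_roots:
        root_resolved = Path(root).resolve()
        # Compare whole path components, not string prefixes, so that
        # "/workspace/project-evil" does not match "/workspace/project".
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False


if __name__ == "__main__":
    roots = ["/workspace/project"]  # assumed allowlist for this example
    for path in (
        "/workspace/project/src/main.py",
        "/workspace/project/../../etc/passwd",
        "/workspace/project-evil/notes.txt",
    ):
        print(path, "allowed" if is_path_allowed(path, roots) else "denied")
```

A check like this is deterministic, which is the contrast the article draws with probabilistic model-layer defenses: the agent's request is either inside an allowed root or it is refused, regardless of how the request was phrased.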