Cloudflare Tests Anthropic Mythos and Warns Reactive Patching Is Obsolete

Cloudflare

May 18, 2026 · Updated Jun 13, 2026

Cloudflare evaluated Anthropic's Mythos Preview model against 50 internal repositories, finding it can autonomously chain minor bugs into severe exploits and generate working proofs of concept. The results suggest that AI-driven offense is outpacing traditional patching cycles, requiring a shift toward architectural defenses that block vulnerabilities at the network edge.

Grant Bourzikas, CISO at Cloudflare, a connectivity cloud and security company, shared results from testing Anthropic’s Mythos Preview against 50 of the company’s repositories. The specialized model autonomously constructs exploit chains, combining multiple low-severity bugs into a high-severity attack. It also generates proofs of concept on its own, writing and iterating on code until it triggers the vulnerability.

Model: Mythos Preview
Test scope: 50 repositories
Key capabilities: Exploit chain construction, autonomous PoC generation
Harness stages: Recon, Hunt, Validate, Gapfill, Dedupe, Trace, Feedback, Report
Availability: Restricted (Project Glasswing partners only)

This evaluation follows Anthropic’s restriction of Mythos Preview under Project Glasswing due to autonomous cyberattack risks. The findings indicate that specialized models can now reason like senior security researchers. OpenAI’s Daybreak platform also targets autonomous defense. In its own testing, Cloudflare found that memory-unsafe languages like C and C++ remain the primary source of AI-weaponized exploits.

Cloudflare argues that faster patching is a failing strategy because AI-driven offense outpaces human testing. Organizations must instead adopt architectural resilience to isolate flaws. Cloudflare now uses a multi-stage discovery harness employing adversarial review, where one agent is tasked with disproving another’s findings to reduce noise.
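The adversarial review idea described above can be pictured with a small Python sketch. It is illustrative only and is not Cloudflare's harness: the `Finding` type, the `adversarial_review` function, and the `toy_skeptic` rule are all hypothetical stand-ins for what would be model-backed agents in practice.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Finding:
    """A candidate vulnerability report (hypothetical shape)."""
    location: str
    claim: str
    evidence: str  # e.g. a reproduction note; empty if purely speculative


def adversarial_review(
    findings: Iterable[Finding],
    skeptic: Callable[[Finding], bool],
) -> List[Finding]:
    """Keep only the findings that the skeptic fails to disprove.

    `skeptic` returns True when it has disproved a finding.
    """
    return [f for f in findings if not skeptic(f)]


def toy_skeptic(finding: Finding) -> bool:
    # Stand-in rule: treat a finding with no supporting evidence as disproved.
    # A real reviewer agent would re-read the code and try to refute the claim.
    return finding.evidence.strip() == ""


candidates = [
    Finding("parser.c:120", "length field is not bounds-checked",
            "crash reproduced with an oversized input"),
    Finding("auth.py:45", "token might be reused", ""),
]
surviving = adversarial_review(candidates, toy_skeptic)
print([f.location for f in surviving])
```

In this toy run only the `parser.c:120` finding survives, because the second candidate carries no evidence. The point of the design is that the reviewer's job is to refute, not to agree, so speculative reports are filtered out before a human sees them.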

View the full update on blog.cloudflare.com

Cloudflare (@Cloudflare) · May 18

Cloudflare's security team spent the last few weeks testing Anthropic's Mythos against fifty of our own repositories. What we learned about offensive AI, why faster patching is the wrong reaction, and what the architecture around vulnerabilities has to look like next. https://t.co/RSrRtIhgaV


Still wondering? A few quick answers below.

What is Anthropic Mythos Preview?

Mythos Preview is a specialized frontier AI model from Anthropic designed for cybersecurity research. Unlike general-purpose models, it can autonomously chain multiple minor software bugs into complex, high-severity exploits. It also runs an autonomous loop in which it writes, compiles, and tests proof-of-concept code in a scratch environment to verify whether a suspected vulnerability is actually exploitable.

How does Cloudflare use AI for vulnerability discovery?

Cloudflare uses a multi-stage automated harness to scan its infrastructure. The pipeline includes reconnaissance to map attack surfaces, parallel hunting agents that each target a specific bug class, and an adversarial validation stage in which a second agent tries to disprove findings. Splitting the work into stages avoids the context-window problems that arise when a generic coding agent is used for research.
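A staged pipeline of this kind can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Cloudflare's implementation: it covers only four of the listed stages (Recon, Hunt, Validate, Dedupe), and the `Repo` and `Finding` shapes, the string-matching "hunters", and the `checked` marker are all hypothetical placeholders for model-driven agents.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

Repo = Dict[str, str]       # path -> file contents (toy stand-in for a repository)
Finding = Tuple[str, str]   # (path, bug class); hypothetical shape
Hunter = Callable[[Repo, List[str]], List[Finding]]


def recon(repo: Repo) -> List[str]:
    """Recon: map the attack surface, here the files that handle input."""
    return sorted(path for path, text in repo.items() if "input" in text)


def make_hunter(bug_class: str, marker: str) -> Hunter:
    """Hunt: build one hunter per bug class. The marker match is a toy heuristic."""
    def hunt(repo: Repo, surface: List[str]) -> List[Finding]:
        return [(path, bug_class) for path in surface if marker in repo[path]]
    return hunt


def validate(repo: Repo, findings: List[Finding]) -> List[Finding]:
    """Validate: drop findings in files that carry the toy 'checked' marker."""
    return [f for f in findings if "checked" not in repo[f[0]]]


def dedupe(findings: List[Finding]) -> List[Finding]:
    """Dedupe: collapse identical findings reported by different hunters."""
    return sorted(set(findings))


def run_pipeline(repo: Repo, hunters: List[Hunter]) -> List[Finding]:
    surface = recon(repo)
    # Hunters run in parallel; each sees only the recon output, not the others' work.
    with ThreadPoolExecutor() as pool:
        batches = list(pool.map(lambda hunt: hunt(repo, surface), hunters))
    raw = [finding for batch in batches for finding in batch]
    return dedupe(validate(repo, raw))


repo = {
    "net/parse.c": "read input; memcpy(dst, src, n); strcpy(name, src)",
    "net/safe.c": "read input; checked memcpy(dst, src, n)",
    "api/run.py": "takes input; os.system(cmd)",
    "docs/readme.txt": "notes that mention memcpy",
}
hunters = [
    make_hunter("buffer-overflow", "memcpy"),
    make_hunter("buffer-overflow", "strcpy"),
    make_hunter("command-injection", "os.system"),
]
results = run_pipeline(repo, hunters)
print(results)
```

Each stage receives a small, well-defined input rather than the whole repository history, which is the property the staged design is meant to provide. Here two hunters flag `net/parse.c` for the same bug class, and the dedupe stage collapses them into one finding.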

Is Anthropic Mythos available for public use?

No. Mythos Preview is not available to the general public because of the risks posed by its autonomous offensive capabilities. Access is currently restricted to Project Glasswing, a defensive cybersecurity coalition that operates as a controlled research environment. Anthropic intends to add further safeguards before any similar cyber-capable frontier models are made available for broader use.

Why does Cloudflare argue that faster patching is not enough?

Cloudflare contends that as AI accelerates the speed at which attackers find vulnerabilities, defenders cannot keep up simply by patching faster. Rushed patching often means skipped regression testing, which can introduce new bugs. Instead, organizations should focus on architectural defenses that block bugs at the network edge and design systems in which a single flaw cannot grant access to other components.
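One way to reason about "a single flaw cannot grant access to other components" is to model which components are allowed to call which, and then ask what an attacker could reach from any one compromised component. The Python sketch below does that with a deny-by-default allow-list. The service names and the `ALLOWED_CALLS` table are invented for illustration and do not describe Cloudflare's architecture.

```python
from collections import deque
from typing import Dict, Set

# Hypothetical service graph: each component lists the components it may call.
ALLOWED_CALLS: Dict[str, Set[str]] = {
    "edge-proxy": {"web-app"},
    "web-app": {"orders-db"},
    "image-resizer": set(),  # isolated: may call nothing
    "orders-db": set(),
}


def is_allowed(caller: str, callee: str) -> bool:
    """Deny by default: a call is permitted only if explicitly listed."""
    return callee in ALLOWED_CALLS.get(caller, set())


def blast_radius(compromised: str) -> Set[str]:
    """Components reachable from a compromised one by following allowed calls."""
    seen = {compromised}
    queue = deque([compromised])
    while queue:
        current = queue.popleft()
        for nxt in ALLOWED_CALLS.get(current, set()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {compromised}


print(is_allowed("image-resizer", "orders-db"))
print(sorted(blast_radius("image-resizer")))
print(sorted(blast_radius("edge-proxy")))
```

In this toy graph a bug in `image-resizer` reaches nothing else, while a bug in `edge-proxy` can reach `web-app` and, through it, `orders-db`. A smaller blast radius means an unpatched flaw does less damage, which is the argument for investing in isolation rather than patch speed alone.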

What are the main limitations of AI in security research?

AI models often suffer from a signal-to-noise problem, producing many speculative findings that a human must review and dismiss. Cloudflare found that models are biased toward reporting bugs even where none exist, and that they produce more false positives in memory-unsafe languages like C and C++. In addition, a model's organic refusals to perform research tasks are probabilistic and therefore inconsistent.


Keep reading

Anthropic Finds 10,000 Critical Vulnerabilities and Releases New Defensive AI Tools

Anthropic's Project Glasswing initiative identified over 10,000 high-severity vulnerabilities in critical software using the unreleased Claude Mythos Preview model. The results suggest that AI has shifted the cybersecurity bottleneck from vulnerability discovery to the human capacity for patching.

Claude · May 1

Anthropic Launches Claude Security Beta to Automatically Scan and Patch Codebases

Anthropic launched Claude Security in public beta for Enterprise customers to identify and remediate vulnerabilities across entire codebases. Unlike traditional scanners that rely on pattern matching, the tool uses reasoning to trace data flows and validate findings through an adversarial pass. This shift reduces false positive fatigue by ensuring every reported issue includes a verified, human-reviewable patch.

Cloudflare Integrates Claude Managed Agents to Secure Autonomous Code Execution

Cloudflare · May 19

Cloudflare partnered with Anthropic to run Claude Managed Agents within isolated Cloudflare Sandboxes, separating the model's reasoning from its execution environment. This allows developers to scale autonomous workflows globally while maintaining zero-trust control over private data and internal services.

Anthropic Claude Mythos Autonomously Writes MCP Servers to Optimize Chip Design

bubble boi · Apr 8

Anthropic's Claude Mythos model demonstrated autonomous engineering by writing its own MCP server to interface with professional chip design software. The model reduced timing slack by 40 percent and performed iterative optimizations without human direction. This marks a shift from AI as a coding assistant to an autonomous domain engineer.

