HeadsUpAI

Cloudflare Tests Anthropic Mythos and Warns Reactive Patching Is Obsolete

Grant Bourzikas, the CISO at Cloudflare, a connectivity cloud and security company, shared results from testing Anthropic’s Mythos Preview against fifty repositories. This specialized model autonomously constructs exploit chains, combining multiple low-severity bugs into high-severity attacks. It also performs autonomous proof-of-concept (PoC) generation, writing and iterating on code until it triggers a vulnerability.
Model: Mythos Preview
Test scope: 50 repositories
Key capabilities: Exploit chain construction, autonomous PoC generation
Harness stages: Recon, Hunt, Validate, Gapfill, Dedupe, Trace, Feedback, Report
Availability: Restricted (Project Glasswing partners only)

This evaluation follows Anthropic’s restriction of Mythos Preview under Project Glasswing due to autonomous cyberattack risks. According to Cloudflare, the findings indicate that specialized models now reason like senior security researchers. OpenAI’s Daybreak platform also targets autonomous defense. Separately, Cloudflare found that memory-unsafe languages like C and C++ remain the primary source of AI-weaponized exploits.

Cloudflare argues that faster patching is a failing strategy because AI-driven offense outpaces human testing. Organizations must instead adopt architectural resilience to isolate flaws. Cloudflare now uses a multi-stage discovery harness employing adversarial review, where one agent is tasked with disproving another’s findings to reduce noise.

Cloudflare (@Cloudflare) on X:

Cloudflare's security team spent the last few weeks testing Anthropic's Mythos against fifty of our own repositories. What we learned about offensive AI, why faster patching is the wrong reaction, and what the architecture around vulnerabilities has to look like next. https://t.co/RSrRtIhgaV


Still wondering? A few quick answers below.

Mythos Preview is a specialized frontier AI model from Anthropic designed for cybersecurity research. Unlike general-purpose models, it can autonomously chain multiple minor software bugs into complex, high-severity exploits. It also features an autonomous loop where it writes, compiles, and tests proof-of-concept code in a scratch environment to verify if a suspected vulnerability is actually exploitable.

Cloudflare uses a multi-stage automated harness to scan its infrastructure. This pipeline includes reconnaissance to map attack surfaces, parallel hunting agents for specific bug classes, and an adversarial validation stage where a second agent tries to disprove findings. This structured approach prevents the context window issues that occur when using generic coding agents for research.
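The hunt, adversarial validation, and dedupe stages described above can be illustrated with a minimal sketch. This is not Cloudflare's harness: the `Finding` type, the `run_harness` function, and the stub agents are all hypothetical names invented for illustration, and plain Python functions stand in for what would be model-driven agents. The other stages (Recon, Gapfill, Trace, Feedback, Report) are left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Finding:
    """A candidate vulnerability reported by a hunting agent (hypothetical shape)."""
    repo: str
    bug_class: str
    location: str
    claim: str


# Hypothetical agent signatures: a hunter proposes findings for a repository,
# and a skeptic returns True when it manages to disprove a finding.
Hunter = Callable[[str], Iterable[Finding]]
Skeptic = Callable[[Finding], bool]


def run_harness(repo: str, hunters: Dict[str, Hunter], skeptic: Skeptic) -> List[Finding]:
    # Hunt: one hunter per bug class, so each agent keeps a narrow focus.
    candidates = [f for hunt in hunters.values() for f in hunt(repo)]

    # Validate: a second agent tries to disprove each finding;
    # only findings that survive the challenge move on.
    survivors = [f for f in candidates if not skeptic(f)]

    # Dedupe: keep the first finding per (bug class, location) pair.
    unique: Dict[tuple, Finding] = {}
    for f in survivors:
        unique.setdefault((f.bug_class, f.location), f)
    return list(unique.values())


# Stub agents with canned output, standing in for model calls.
def overflow_hunter(repo: str) -> List[Finding]:
    return [
        Finding(repo, "buffer-overflow", "parse.c:120", "unchecked length"),
        Finding(repo, "buffer-overflow", "parse.c:120", "length not validated"),
        Finding(repo, "buffer-overflow", "util.c:8", "speculative: maybe overflow"),
    ]


def injection_hunter(repo: str) -> List[Finding]:
    return [Finding(repo, "injection", "query.py:42", "unsanitized input")]


def stub_skeptic(finding: Finding) -> bool:
    # Toy rule: treat anything flagged as speculative as disproved.
    return finding.claim.startswith("speculative")


result = run_harness(
    "demo-repo",
    {"buffer-overflow": overflow_hunter, "injection": injection_hunter},
    stub_skeptic,
)
for f in result:
    print(f.bug_class, f.location)
```

In this toy run the speculative finding is rejected by the skeptic and the two reports for the same location collapse into one, which is the noise reduction the adversarial stage is meant to provide.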

Mythos Preview is not available to the general public because of the significant risks associated with its autonomous offensive capabilities. Access is currently restricted to partners in Project Glasswing, a defensive cybersecurity coalition that serves as a controlled research environment. Anthropic intends to add further safeguards before any similar cyber-capable frontier models are made available for broader use.

Cloudflare contends that as AI accelerates the speed at which attackers find vulnerabilities, defenders cannot keep up by simply patching faster. Rapid patching often leads to skipped regression testing, which can introduce new bugs. Instead, organizations should focus on architectural defenses that block bugs at the network edge and design systems where a single flaw cannot grant access to other components.

AI models often suffer from a signal-to-noise problem, producing many speculative findings that require human attention to dismiss. Cloudflare found that models are biased toward finding bugs even where none exist and produce more false positives in memory-unsafe languages like C and C++. Additionally, a model's organic refusals to perform research tasks can be inconsistent and probabilistic.
