XBOW Challenges AI Safety Report With 1,060 Autonomous Attacks

XBOW

Mar 18, 2026 · Updated Jun 5, 2026

XBOW, an autonomous offensive security platform, directly challenges the International AI Safety Report 2026's conclusion that fully autonomous attacks aren't operational yet. Their agents have executed 48-step exploit chains and replicated a 40-hour pentest in 28 minutes, fully automated.

XBOW published a rebuttal to the International AI Safety Report 2026, which concluded that AI systems "cannot reliably execute long, multi-stage attack sequences." XBOW's position is that the finding applies to general-purpose AI, not to purpose-built systems. Its agents have submitted over 1,060 vulnerabilities on HackerOne, executed a 48-step exploit chain that escalated a blind SSRF to full file read, and cracked a padding oracle in 17.5 minutes, all automated, with no human involved in discovery or exploitation.

The difference is architecture. A single general-purpose agent's performance over a long attack chain is "jagged." XBOW instead runs thousands of short-lived, narrow-objective agents, orchestrated by a persistent coordinator and validated by deterministic logic. If one stalls mid-chain, another starts fresh, with no accumulated context and no compounding errors.
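The coordinator pattern described above can be illustrated with a toy sketch. This is not XBOW's code: the names `Objective`, `Coordinator`, `toy_agent`, and `demo` are hypothetical, the "agents" are plain functions standing in for model-driven workers, and the validators are simple string checks. It only shows the shape of the idea: the coordinator keeps the persistent state, each attempt gets a fresh agent that sees nothing but the objective and previously validated facts, and a result counts only if a deterministic check accepts it.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Objective:
    """One narrow step in a chain, paired with a deterministic validator."""
    name: str
    check: Callable[[str], bool]


@dataclass
class Coordinator:
    """Persistent coordinator: owns validated state, spawns throwaway agents."""
    objectives: list[Objective]
    max_attempts: int = 3
    validated: dict[str, str] = field(default_factory=dict)
    attempts_log: list[tuple[str, int, bool]] = field(default_factory=list)

    def run(self, spawn_agent: Callable[[str, dict, int], Optional[str]]) -> bool:
        for obj in self.objectives:
            for attempt in range(1, self.max_attempts + 1):
                # Each attempt is a fresh agent: it receives a copy of the
                # validated facts only, never a previous agent's context.
                claim = spawn_agent(obj.name, dict(self.validated), attempt)
                ok = claim is not None and obj.check(claim)
                self.attempts_log.append((obj.name, attempt, ok))
                if ok:
                    self.validated[obj.name] = claim
                    break
            else:
                return False  # every attempt at this objective failed
        return True


def toy_agent(objective: str, facts: dict, attempt: int) -> Optional[str]:
    """Hypothetical stand-in for an agent; stalls once on step-2."""
    if objective == "step-2" and attempt == 1:
        return None  # simulated stall mid-chain
    return f"proof:{objective}"


def demo() -> tuple[bool, Coordinator]:
    objectives = [
        Objective(f"step-{i}", lambda claim, i=i: claim == f"proof:step-{i}")
        for i in (1, 2, 3)
    ]
    coordinator = Coordinator(objectives)
    return coordinator.run(toy_agent), coordinator


if __name__ == "__main__":
    done, coordinator = demo()
    print(done, coordinator.attempts_log)
```

In this sketch the stalled attempt on `step-2` leaves no trace in the state later agents see; only the validated result of the retry is carried forward, which is the property the paragraph attributes to the architecture.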

Security teams that test on annual or quarterly cycles are exposed for most of the year. If your testing cadence doesn't match real-world offense, XBOW's continuous automated penetration testing is where to start.

View the full update on xbow.com

XBOW

@Xbow · Mar 16

1,060 autonomous attack chains later, the narrative still says “not possible.” The International AI Safety Report 2026 concludes that fully autonomous attacks are not here yet. The experiences of teams deploying real-world autonomous offense tell a different story. In our latest blog, we unpack where the industry’s model of AI offense diverges from what is already operational and what that shift means for defenders: https://t.co/lkSZC1uPlc


