HeadsUpAI

XBOW Challenges AI Safety Report With 1,060 Autonomous Attacks


XBOW, an autonomous offensive security platform, published a rebuttal to the International AI Safety Report 2026, which concluded AI systems "cannot reliably execute long, multi-stage attack sequences." XBOW's position: that finding applies to general-purpose AI, not purpose-built systems. Their agents have submitted over 1,060 vulnerabilities on HackerOne, executed a 48-step exploit chain escalating a blind SSRF to full file-read, and cracked a padding oracle in 17.5 minutes — all automated, no human in discovery or exploitation.

The difference is architecture. The performance curve of a single general-purpose AI agent is "jagged." XBOW instead runs thousands of short-lived, narrow-objective agents, orchestrated by a persistent coordinator and validated by deterministic logic. If one stalls mid-chain, another starts fresh, with no accumulated context and no compounding errors.
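To make the pattern concrete, here is a minimal Python sketch of a persistent coordinator that hands each step to a fresh, short-lived worker and accepts a result only if a deterministic check passes. It is a toy illustration of the general idea, not XBOW's implementation: the names `Step`, `run_chain`, and `STEPS` are hypothetical, and the "agents" are plain functions doing arithmetic rather than security tooling.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Step:
    """One narrow objective in a longer chain (hypothetical structure)."""
    name: str
    # Stand-in for a short-lived agent: receives only the last validated
    # state and the attempt number, and returns a candidate or None (stall).
    agent: Callable[[int, int], Optional[int]]
    # Deterministic check of the candidate against the previous state.
    validate: Callable[[int, int], bool]


def run_chain(
    steps: List[Step], initial: int, max_attempts: int = 3
) -> Tuple[Optional[int], List[Tuple[str, int, bool]]]:
    """Persistent coordinator: only validated state moves forward."""
    state = initial
    log: List[Tuple[str, int, bool]] = []
    for step in steps:
        for attempt in range(1, max_attempts + 1):
            # Each attempt is a fresh call: nothing from a failed attempt
            # is passed on, so errors cannot compound across attempts.
            candidate = step.agent(state, attempt)
            ok = candidate is not None and step.validate(state, candidate)
            log.append((step.name, attempt, ok))
            if ok:
                state = candidate
                break
        else:
            # Every attempt stalled or failed validation: stop the chain.
            return None, log
    return state, log


# Toy agents (assumptions for the demo): the first stalls once,
# the second returns a wrong answer once before succeeding.
STEPS = [
    Step(
        name="double",
        agent=lambda x, attempt: None if attempt == 1 else x * 2,
        validate=lambda prev, cand: cand == prev * 2,
    ),
    Step(
        name="increment",
        agent=lambda x, attempt: x + 2 if attempt == 1 else x + 1,
        validate=lambda prev, cand: cand == prev + 1,
    ),
]

if __name__ == "__main__":
    result, log = run_chain(STEPS, initial=5)
    print(result)
    for entry in log:
        print(entry)
```

The design choice worth noting is that the coordinator, not the worker, owns the chain's state, so a stalled or wrong attempt costs one retry instead of corrupting everything that follows.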

Security teams that test on annual or quarterly cycles are exposed for most of the year. If your testing cadence doesn't match the pace of real-world offense, XBOW's continuous automated penetration testing is where to start.

XBOW (@Xbow) on X:

1,060 autonomous attack chains later, the narrative still says “not possible.” The International AI Safety Report 2026 concludes that fully autonomous attacks are not here yet. The experiences of teams deploying real-world autonomous offense tell a different story. In our latest blog, we unpack where the industry’s model of AI offense diverges from what is already operational and what that shift means for defenders: https://t.co/lkSZC1uPlc

