OpenRouter Agent Battle Royale Reveals Alignment Tax Impacts Performance

jacky

OpenRouter ran 11 LLMs through 30 battle royale games to test agent performance in zero-sum settings. Grok 4.1 Fast won 13 of the 30 games at $0.97 per win, while Claude Sonnet 4.6 struggled because it prioritized cooperation. The results suggest that alignment training, designed for safety, can act as a performance tax in competitive tasks that reward ruthlessness.
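To put the headline numbers in context, here is a minimal sketch of the arithmetic behind them. It assumes that "cost per win" means total spend divided by wins, and that all 11 models played in every game; the summary states neither, so the implied total spend and the chance baseline below are illustrative only. The function and constant names are hypothetical.

```python
# Figures quoted in the summary above.
GAMES = 30
MODELS = 11
GROK_WINS = 13
GROK_COST_PER_WIN = 0.97  # dollars


def win_rate(wins: int, games: int) -> float:
    """Fraction of games won."""
    return wins / games


def implied_total_cost(wins: int, cost_per_win: float) -> float:
    """Total spend, ASSUMING cost per win = total spend / wins."""
    return wins * cost_per_win


rate = win_rate(GROK_WINS, GAMES)
total = implied_total_cost(GROK_WINS, GROK_COST_PER_WIN)

# Chance baseline, ASSUMING every game includes all 11 models
# and each is equally likely to win.
baseline = 1 / MODELS

print(f"win rate: {rate:.1%}")
print(f"chance baseline: {baseline:.1%}")
print(f"implied total spend: ${total:.2f}")
```

Under those assumptions, the reported win count sits well above what an equal-odds draw among 11 entrants would give.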

jacky (@jjacky) on X:

no benchmark will tell you this: LLMs can be /too/ nice
unsurprisingly, in a competitive zero-sum setting, being nice can be bad
i built royale: last agent standing, a br for agents, and ran it 30 times
the nicest model lost hard. the model you least expected, won
🧵: https://t.co/lEFpfqnIdJ


