New on the Anthropic Engineering Blog: We give prospective performance engineering candidates a notoriously difficult take-home exam. It worked well—until Opus 4.5 beat it. Here's how we designed (and redesigned) it: https://t.co/3RZVyhpVij
Anthropic Releases Hiring Exam Claude Keeps Beating as Open Engineering Challenge
Anthropic · Updated
Anthropic's performance engineering take-home, completed by 1,000+ candidates since 2024, has been defeated by successive Claude models, forcing three complete redesigns. Anthropic is releasing the original exam on GitHub, where the fastest human solution still beats the best result Claude has reached, even after hours of compute.
The current version of the exam uses Zachtronics-inspired puzzles with constrained instruction sets, problems sufficiently novel that models can't draw on training data. The takeaway: domain-specific problems fall to models with extensive training coverage, while novel, constrained puzzles are where human reasoning still wins.
The original exam is open-sourced. Score below 1,487 cycles and email performance-recruiting@anthropic.com with your code and resume.
Every HeadsUpAI update is written based on its original source and reviewed before it's published.

