Clicking the "Approve permission" button is difficult. We show that agents can do that for you. Check out our alignment blog: https://t.co/2jwUmSws8y https://t.co/9eEHfhA4TH
OpenAI Open Sources Auto-review to Automate Safety Checks for Codex Agents
· Updated
OpenAI released the research and code for Auto-review, a secondary agent that handles permission requests for Codex without requiring human intervention. This architecture allows autonomous coding agents to perform sensitive tasks like network calls while maintaining safety oversight through a separate reasoning model.
GPT-5.4 Thinking—to evaluate permission requests when the primary agent attempts actions outside its restricted sandbox (a secure, isolated environment). This model replaces the need for constant manual human approval.- Reviewer model
- GPT-5.4 Thinking (low reasoning)
- Human interruption reduction
- 200x
- Prompt injection recall
- 99.3%
- Overreach recall
- 90.3%
- Recovery rate after rejection
- >50%
- Availability
- Open source (Codex repository)
This shift addresses the "approval bottleneck" that prevents unattended Codex agent workflows from completing long-running background tasks. It mirrors Claude Code's new auto mode, which also uses per-action safety classification. Delegating oversight to a separate model maintains safety standards without sacrificing the productivity of autonomous agents.
You can now access the Auto-review logic in the open-source Codex repository. Internal data shows the system catches 99.3% of prompt injections while reducing human interruptions by 200x. It provides a safer default for agents interacting with external networks or sensitive file systems without constant human oversight.
Still wondering? A few quick answers below.
Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →
