We trained Composer to self-summarize through RL instead of a prompt. This reduces the error from compaction by 50% and allows Composer to succeed on challenging coding tasks requiring hundreds of actions. https://t.co/ryfalZHLZS
Cursor Trains Composer to Self-Summarize via RL for Longer Coding Tasks
Cursor trained Composer, its agentic coding model, to self-summarize through reinforcement learning rather than prompt-based compaction. This cuts context compaction error by 50% while using one-fifth the tokens, letting Composer handle tasks requiring hundreds of turns.
Standard compaction methods, such as prompted summarization or sliding windows, drop critical information as tasks grow longer. Training summarization as a native behavior means Composer carries task state more reliably across hundreds of turns. Cursor demonstrated this on Terminal-Bench 2.0, a command-line coding benchmark: Composer ran 170 turns, condensing 100,000+ tokens to 1,000.
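To make the failure mode concrete, here is a minimal sketch of the two standard compaction styles named above. It is an illustration under stated assumptions, not Cursor's implementation: the function names, the word-count tokenizer, and the toy summarizer are all hypothetical stand-ins. In a real agent the summarizer would be a model call, and in Composer's case that behavior is learned through RL rather than prompted.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    # Assumption: a crude stand-in for a real tokenizer (one token per word).
    return len(text.split())


def sliding_window(turns: List[str], budget: int) -> List[str]:
    """Keep the most recent turns that fit in the budget; older turns are dropped."""
    kept: List[str] = []
    used = 0
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))


def summarize_history(
    turns: List[str], budget: int, summarize: Callable[[List[str]], str]
) -> List[str]:
    """Replace the whole history with one summary once it exceeds the budget."""
    if sum(count_tokens(t) for t in turns) <= budget:
        return list(turns)
    return [summarize(turns)]


def toy_summarizer(turns: List[str]) -> str:
    # Hypothetical placeholder for a model-written summary:
    # it keeps only turns explicitly tagged as notes.
    notes = [t for t in turns if t.startswith("NOTE:")]
    return "SUMMARY: " + " | ".join(notes)


history = [
    "NOTE: tests live in tests/unit",
    "ran ls and saw many files here",
    "opened main.py and read the code",
    "edited main.py to fix the import",
]

windowed = sliding_window(history, budget=14)
summarized = summarize_history(history, budget=14, summarize=toy_summarizer)

print(windowed)    # the early NOTE about the test directory is gone
print(summarized)  # the NOTE survives, but only because the summarizer kept it
```

The sliding window silently loses the early note about where the tests live, while the summary keeps it only as well as the summarizer chooses what matters. That choice is the part a summary-based approach depends on, which is why the quality of the summarizer determines how much task state survives compaction.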
Try Composer on long refactors or debugging sessions where agents typically lose context midway. Self-summarization targets exactly those multi-turn, high-token tasks where standard compaction falls short.