Cursor Trains Composer to Self-Summarize via RL for Longer Coding Tasks

Cursor

Mar 20, 2026 · Updated Apr 25, 2026

Cursor trained Composer, its agentic coding model, to self-summarize through reinforcement learning rather than prompt-based compaction. This cuts context compaction error by 50% while using one-fifth the tokens, letting Composer handle tasks requiring hundreds of turns.

Instead of relying on a separate summarization prompt, Composer now produces its own summaries as a trained behavior. When Composer hits its context-length trigger mid-task, it pauses, generates a summary of roughly 1,000 tokens, and continues the task. The RL loop includes this compaction step, so the reward covers both the agent's responses and the quality of each self-summary. On CursorBench Hard (with 40k- and 80k-token triggers), self-summarization cuts compaction error by 50% versus a tuned prompt-based baseline while using one-fifth the tokens.
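The trigger-then-summarize loop described above can be pictured with a minimal Python sketch. Everything here is an assumption made for illustration: `AgentContext`, `run_turn`, `assign_reward`, the word-count tokenizer, and the toy policy functions are hypothetical names, not Cursor's implementation. In particular, crediting one scalar reward to every generated segment is only one simple way to let the reward cover both responses and summaries.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


@dataclass
class AgentContext:
    trigger_tokens: int  # hypothetical context-length trigger
    messages: List[str] = field(default_factory=list)
    generated: List[Tuple[str, str]] = field(default_factory=list)  # (kind, text)
    compactions: int = 0

    def total_tokens(self) -> int:
        return sum(count_tokens(m) for m in self.messages)


def run_turn(
    ctx: AgentContext,
    observation: str,
    act: Callable[[List[str]], str],
    summarize: Callable[[List[str]], str],
) -> str:
    """One agent turn: observe, compact if the trigger is hit, then act."""
    ctx.messages.append(observation)
    if ctx.total_tokens() >= ctx.trigger_tokens:
        # Pause the task, write a summary, and replace the history with it.
        summary = summarize(ctx.messages)
        ctx.generated.append(("summary", summary))
        ctx.messages = [summary]
        ctx.compactions += 1
    action = act(ctx.messages)
    ctx.generated.append(("action", action))
    ctx.messages.append(action)
    return action


def assign_reward(ctx: AgentContext, reward: float) -> List[Tuple[str, str, float]]:
    # Assumption: one scalar task reward is credited to every generated segment,
    # so summaries are trained by the same signal as actions.
    return [(kind, text, reward) for kind, text in ctx.generated]


def toy_act(messages: List[str]) -> str:
    return "act"


def toy_summarize(messages: List[str]) -> str:
    # Toy summary: keep only the last five words of the history.
    words = " ".join(messages).split()
    return "SUMMARY: " + " ".join(words[-5:])


def demo() -> AgentContext:
    ctx = AgentContext(trigger_tokens=50)
    for _ in range(10):
        run_turn(ctx, "obs " * 20, toy_act, toy_summarize)
    return ctx
```

In a real system the same policy would play the role of both `act` and `summarize`; they are separate callables here only to keep the toy example readable.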

Standard compaction, whether prompted summarization or a sliding window, tends to drop critical information as tasks grow longer. Training summarization as a native behavior lets Composer carry task state more reliably across hundreds of turns. Cursor demonstrated this on Terminal-Bench 2.0, a command-line coding benchmark: Composer ran for 170 turns, condensing more than 100,000 tokens to roughly 1,000.
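To make the failure mode of window-based compaction concrete, here is a small Python sketch of a sliding window that keeps only the most recent messages within a token budget. The function names, the word-count tokenizer, and the sample history are all hypothetical. The last helper simply restates the article's figures as a ratio: 100,000 tokens condensed to 1,000 is about a 100x reduction.

```python
from typing import List


def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def sliding_window(messages: List[str], budget: int) -> List[str]:
    """Keep the newest messages that fit in the budget; older ones are dropped."""
    kept: List[str] = []
    used = 0
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()
    return kept


def compression_ratio(tokens_before: int, tokens_after: int) -> float:
    return tokens_before / tokens_after


# Hypothetical history: the task goal is stated once, at the very start.
history = ["goal: fix the flaky login test"] + [
    f"step {i} ran tool" for i in range(20)
]
window = sliding_window(history, budget=20)
```

After enough steps the window no longer contains the opening goal message, which is the kind of early task state a summary is meant to preserve.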

Try Composer on long refactors or debugging sessions where agents typically lose context midway. Self-summarization targets exactly those multi-turn, high-token tasks where standard compaction falls short.

View the full update on cursor.com

Cursor (@cursor_ai) · Mar 17

We trained Composer to self-summarize through RL instead of a prompt. This reduces the error from compaction by 50% and allows Composer to succeed on challenging coding tasks requiring hundreds of actions. https://t.co/ryfalZHLZS


