Cursor Releases Composer 2 Technical Report on Coding Agent Training

Cursor

Mar 26, 2026 · Updated Apr 25, 2026

Cursor published a technical report on Composer 2, a coding agent trained by continued pretraining from Kimi K2.5 followed by RL on real engineering tasks. It scores 61.3 on CursorBench, 37% above Composer 1.5, matching frontier models at lower cost.

Composer 2 was trained in two phases: continued pretraining from Kimi K2.5, a mixture-of-experts (MoE) model with 1.04T total parameters and 32B active, followed by reinforcement learning (RL) on real engineering tasks run in the same harness and environments used in production.
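To put the MoE figures in perspective, here is a small sketch that computes what share of the base model's parameters is active per token. It uses only the two numbers quoted above; the function name is illustrative, and the result says nothing about Composer 2's own parameter counts, which this summary does not state.

```python
# Figures quoted for the base model, Kimi K2.5 (not for Composer 2 itself).
TOTAL_PARAMS = 1.04e12   # 1.04T total parameters
ACTIVE_PARAMS = 32e9     # 32B parameters active per token


def active_fraction(total: float, active: float) -> float:
    """Return the share of parameters used for each token in an MoE model."""
    if total <= 0:
        raise ValueError("total must be positive")
    return active / total


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
    print(f"Active share per token: {frac:.1%}")
```

Roughly 3% of the weights are exercised on any given token, which is one reason a model of this size can be served at a lower per-token compute cost than a dense model with the same total parameter count.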

The headline claim is that domain-specialized training can reach frontier-level coding performance at lower inference cost. Composer 2 scores 61.3 on CursorBench (37% above Composer 1.5 and 70% above base Kimi K2.5), along with 73.7 on SWE-bench Multilingual and 61.7 on Terminal-Bench. Those results are competitive with Opus 4.6 High, at a cost per task comparable to smaller variants.
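The percentage gains can be turned back into approximate baseline scores. The sketch below assumes "X% above" means a relative improvement (new = old × (1 + X/100)); that reading is an assumption, and the derived baselines are back-of-envelope estimates, not figures taken from the report.

```python
COMPOSER_2_SCORE = 61.3  # CursorBench score quoted above


def implied_baseline(new_score: float, percent_above: float) -> float:
    """Back out the baseline score, assuming a relative improvement."""
    return new_score / (1.0 + percent_above / 100.0)


if __name__ == "__main__":
    # Assumed interpretation: 37% and 70% are relative gains on CursorBench.
    composer_1_5 = implied_baseline(COMPOSER_2_SCORE, 37)
    kimi_k2_5 = implied_baseline(COMPOSER_2_SCORE, 70)
    print(f"Implied Composer 1.5 score: {composer_1_5:.1f}")
    print(f"Implied Kimi K2.5 score:    {kimi_k2_5:.1f}")
```

Under that assumption, Composer 1.5 would sit near 44.7 and base Kimi K2.5 near 36.1 on CursorBench. Because the quoted percentages are rounded, the true baselines may differ by a few tenths of a point.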

CursorBench is built from real engineering sessions rather than curated bug fixes: its tasks have a median of 181 lines changed, versus 7–10 for SWE-bench Verified. If you follow coding agent development, the benchmark design choices here are as informative as the scores.

View the full update on cursor.com

Cursor (@cursor_ai), Mar 24:

We're releasing a technical report describing how Composer 2 was trained. https://t.co/cfW8lyMWEy

