HeadsUpAI

Cursor and Fireworks AI Detail the Specialized Training Infrastructure Behind Composer 2.5

Fireworks AI, an inference platform for fast model serving, detailed the engineering behind Composer 2.5 following Cursor's release of the model. Composer's design treats model weights as a finite store of bits and dedicates all of them to coding. This specialization, combined with pipelined reinforcement learning, is reported to match frontier performance at one-tenth the cost of Claude Opus.
Pricing (input): $0.50 per million tokens
Pricing (output): $2.50 per million tokens
Weight sync speed: Under 1 minute for 1TB
Compression ratio: 20x for weight transfers
Update frequency: Every few hours

The training infrastructure targets scaling bottlenecks in distributed reinforcement learning. The team used delta compression to sync 1TB of weights across global clusters in under a minute. They also introduced "router replay" to fix numerical divergence in Mixture of Experts models, ensuring that training and inference workers activate the same experts.
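To make the router replay idea concrete, here is a minimal sketch in pure Python. It assumes a toy top-k router. The names `top_k_experts` and `route`, and the hand-picked logits, are illustrative assumptions and not Cursor's or Fireworks AI's implementation. It shows how a tiny numerical difference between an inference worker and a training worker can flip the selected expert, and how replaying the recorded choice removes the mismatch.

```python
from typing import List, Optional


def top_k_experts(logits: List[float], k: int) -> List[int]:
    """Return indices of the k largest router logits (ties go to the lower index)."""
    order = sorted(range(len(logits)), key=lambda i: (-logits[i], i))
    return sorted(order[:k])


def route(logits: List[float], k: int, replay: Optional[List[int]] = None) -> List[int]:
    """Pick experts from the logits, unless a recorded choice is replayed."""
    if replay is not None:
        return list(replay)
    return top_k_experts(logits, k)


# Inference worker: computes router logits and records which expert it used.
inference_logits = [0.10, 0.7000001, 0.70, 0.05]
recorded = route(inference_logits, k=1)

# Training worker: same token, slightly different numerics (hypothetical drift).
training_logits = [0.10, 0.70, 0.7000001, 0.05]
without_replay = route(training_logits, k=1)                  # picks a different expert
with_replay = route(training_logits, k=1, replay=recorded)    # matches inference
```

In this toy case the training worker would pick expert 2 on its own, while the inference worker used expert 1. Passing the recorded indices makes both sides agree.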

Cursor now uses real-time reinforcement learning to ship model updates every few hours, which turns the product itself into a proprietary training environment. The work builds on the Cursor Composer 2 technical report. The model is available to users now at $0.50 per million input tokens and $2.50 per million output tokens.

Fireworks AI (@FireworksAI_HQ) on X:

1/ Composer 2.5 is having a moment. Worth a look at how the team actually got here. @cursor_ai's Federico Cassano and @FireworksAI_HQ cofounder Dima Dzhulgakov discussed Training Data with @sonyatweetybird. The whole episode is worth your time, but we’ll break it down here.

Still wondering? A few quick answers below.

Composer 2.5 is a specialized agentic coding model developed by Cursor that autonomously writes, tests, and iterates on software across complex codebases. Unlike general-purpose models, it is trained specifically for engineering tasks. This specialization allows it to match the performance of frontier models like Claude Opus while operating at one-tenth the cost.

Cursor uses a top-down training approach that combines mid-training on code with large-scale reinforcement learning. They implement pipelined reinforcement learning, which allows training and data collection to happen simultaneously. This method maximizes GPU utilization and enables the team to ship updated versions of the model to users every few hours based on real-world usage.
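The overlap between data collection and training can be sketched with two threads and a queue. This is a simplified illustration under stated assumptions: `run_pipeline`, the integer "policy version", and the batch size are hypothetical stand-ins, and a real system would run samplers and trainers on separate GPU clusters rather than in one process.

```python
import queue
import threading


def run_pipeline(num_rollouts: int = 12, batch_size: int = 4):
    """Run a sampler and a trainer concurrently; return (final_version, consumed)."""
    rollouts = queue.Queue(maxsize=8)   # buffer between sampler and trainer
    state = {"version": 0}              # stand-in for the current policy weights
    lock = threading.Lock()
    consumed = []

    def sampler():
        # Keeps generating rollouts with whatever policy version is current,
        # without waiting for the trainer to finish its step.
        for i in range(num_rollouts):
            with lock:
                version = state["version"]
            rollouts.put((i, version))
        rollouts.put(None)              # sentinel: no more rollouts

    def trainer():
        batch = []
        while True:
            item = rollouts.get()
            if item is None:
                break
            batch.append(item)
            if len(batch) == batch_size:
                consumed.extend(batch)  # "train" on the full batch
                batch = []
                with lock:
                    state["version"] += 1   # publish updated weights
        consumed.extend(batch)          # leftover partial batch, no update

    threads = [threading.Thread(target=sampler), threading.Thread(target=trainer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["version"], consumed


final_version, consumed = run_pipeline()
```

Each rollout carries the policy version it was generated with, so some rollouts may come from a slightly older policy than the one being trained. That staleness is the usual trade-off for keeping both sides busy.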

The model is designed to be significantly more cost-effective than general-purpose frontier models. It is currently priced at $0.50 per million input tokens and $2.50 per million output tokens. This lower price point is made possible by dedicating the model's finite weight capacity entirely to software engineering tasks rather than general world knowledge.
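As a quick worked example of the listed prices, the sketch below computes the cost of a single request. The token counts are hypothetical and chosen only to illustrate the arithmetic. The function name `request_cost_usd` is likewise an assumption.

```python
INPUT_USD_PER_MTOK = 0.50    # listed input price per million tokens
OUTPUT_USD_PER_MTOK = 2.50   # listed output price per million tokens


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the listed per-million-token prices."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical agentic session: 200k tokens read, 20k tokens written.
cost = request_cost_usd(200_000, 20_000)
```

For this hypothetical session the input side costs $0.10 and the output side $0.05, for $0.15 in total.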

Cursor uses a custom infrastructure built on Fireworks AI to sync 1TB of model weights across global clusters in under a minute. They use a lossless delta compression scheme to shrink data transfers by 20x. They also use a technique called router replay to prevent numerical errors that can cause training to fail in distributed environments.
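One common way to build a lossless delta scheme is to XOR the new checkpoint against the old one and compress the result, since unchanged bytes become zeros. The sketch below illustrates that general idea with Python's standard `zlib` module. It is an assumption for illustration, not the scheme Cursor or Fireworks AI actually uses, and the toy data (random bytes with 500 flipped bits) is far sparser than a real weight update may be.

```python
import random
import zlib


def encode_delta(old: bytes, new: bytes) -> bytes:
    """XOR the two checkpoints and compress the (mostly zero) difference."""
    if len(old) != len(new):
        raise ValueError("checkpoints must have the same size")
    delta = bytes(a ^ b for a, b in zip(old, new))
    return zlib.compress(delta, 9)


def apply_delta(old: bytes, payload: bytes) -> bytes:
    """Rebuild the new checkpoint exactly from the old one plus the payload."""
    delta = zlib.decompress(payload)
    return bytes(a ^ b for a, b in zip(old, delta))


# Toy "weights": 100k random bytes, then a small hypothetical update.
rng = random.Random(0)
old_weights = bytes(rng.getrandbits(8) for _ in range(100_000))
updated = bytearray(old_weights)
for idx in rng.sample(range(len(updated)), 500):
    updated[idx] ^= 0x01               # flip one bit in 500 positions
new_weights = bytes(updated)

payload = encode_delta(old_weights, new_weights)
restored = apply_delta(old_weights, payload)
```

The receiver only needs the previous checkpoint and the payload to reconstruct the new weights bit for bit. The ratio achieved depends entirely on how much of the checkpoint changed, so the toy numbers here say nothing about the reported 20x figure.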

Cursor specializes its models because it views model weights as a storage drive with a finite number of bits. By intentionally excluding general world knowledge and focusing all capacity on coding, the model becomes more capable and efficient at specific engineering tasks. Because each update is trained on actual product usage, this specialization also creates a proprietary moat in the form of a continuous training loop.
