We’re seeing lots of interest in how Cursor delivered Composer 2. One less obvious insight: you don't need to spend billions on a giant cluster to do reinforcement learning. With disaggregated sampling, we ran @Cursor_ai Composer 2 training across 3-4 clusters worldwide, with a unified capacity of Fireworks Virtual Cloud. Check how we optimize cross-region 1TB+ model updates by 98%+ while keeping staleness under a few minutes: https://t.co/0Ziv6ssFNx
Fireworks AI Powers Cursor Composer 2 With Distributed Global RL Infrastructure
Fireworks AI · Updated
Fireworks AI described the infrastructure behind Cursor's Composer 2, which used disaggregated sampling to run reinforcement learning (RL) across multiple global clusters. Because each policy update ships as a compressed delta covering only about 2% of the model weights, training did not depend on a single co-located mega-cluster. This makes frontier-scale RL training more economically viable on fragmented, multi-region GPU capacity.
bf16 remain bit-equivalent. Instead of transferring a full 1TB model, the system sends a 20GB compressed delta, reducing cross-region traffic by 98% while maintaining exact reconstruction.
This approach challenges the assumption that frontier RL requires a single, co-located mega-cluster. By making policy updates small, teams can use fragmented GPU capacity across different regions. Cursor used this to train Composer 2 across four global clusters, turning distributed inference into a unified pool for generating training data.
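The delta idea can be illustrated with a small sketch. This is not Fireworks' actual wire format: the function names, the index-plus-value layout, and the use of zlib are all assumptions made for illustration. Weights are modeled as raw 16-bit patterns (as bf16 values would be stored), and only positions whose bits changed are shipped.

```python
import random
import struct
import zlib

def encode_delta(old, new):
    """Pack (index, new 16-bit value) pairs for positions whose bits changed.

    Hypothetical layout: count, then all indices (uint32), then all values (uint16).
    """
    if len(old) != len(new):
        raise ValueError("weight buffers must have the same length")
    changed = [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
    n = len(changed)
    payload = struct.pack(f"<I{n}I{n}H", n, *changed, *(new[i] for i in changed))
    return zlib.compress(payload)

def apply_delta(old, blob):
    """Rebuild the new weights exactly from the old weights plus the delta."""
    raw = zlib.decompress(blob)
    (n,) = struct.unpack_from("<I", raw, 0)
    indices = struct.unpack_from(f"<{n}I", raw, 4)
    values = struct.unpack_from(f"<{n}H", raw, 4 + 4 * n)
    out = list(old)
    for i, v in zip(indices, values):
        out[i] = v
    return out

# Toy data: 50,000 16-bit "weights", of which 2% change in one update.
random.seed(0)
old = [random.getrandbits(16) for _ in range(50_000)]
new = list(old)
for i in random.sample(range(len(old)), 1_000):
    new[i] ^= 1  # flip one bit so the position genuinely changes

blob = encode_delta(old, new)
restored = apply_delta(old, blob)
full_size = 2 * len(new)  # bytes needed to ship every 16-bit weight
```

Because unchanged positions are never transmitted, the receiver reconstructs the new weights bit for bit, and the payload scales with the number of changed weights, not with model size. A production system would presumably work per tensor and use a tighter index encoding; the sketch keeps both simple.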
You can implement this via the Fireworks Training SDK, which supports fully managed RL or a "bring your own trainer" model. The platform provides OpenAI-compatible sampling endpoints and a weight update API. These tools bound policy staleness to a few minutes and keep in-memory GPU swaps under 60 seconds.
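One way to picture a staleness bound is a trainer-side filter that discards samples generated by a policy that was superseded too long ago. The sketch below is hypothetical and does not use the Fireworks Training SDK: the `Rollout` class, the `publish_times` mapping, the 180-second threshold, and the definition of staleness (seconds since the sampling policy was superseded) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

MAX_STALENESS_S = 180.0  # hypothetical stand-in for "a few minutes"

@dataclass
class Rollout:
    policy_version: int  # version of the weights that generated this sample
    text: str

def staleness_s(version, publish_times, now):
    """Seconds since `version` was superseded by a newer one (0.0 if it is the latest)."""
    newer = [t for v, t in publish_times.items() if v > version]
    return max(0.0, now - min(newer)) if newer else 0.0

def keep_fresh(rollouts, publish_times, now, max_staleness_s=MAX_STALENESS_S):
    """Keep only rollouts whose sampling policy is within the staleness bound."""
    return [
        r for r in rollouts
        if staleness_s(r.policy_version, publish_times, now) <= max_staleness_s
    ]

# Toy timeline: version -> time (in seconds) at which the trainer published it.
publish_times = {1: 0.0, 2: 100.0, 3: 400.0}
rollouts = [Rollout(1, "a"), Rollout(2, "b"), Rollout(3, "c")]
kept = keep_fresh(rollouts, publish_times, now=450.0)
```

In this toy timeline, version 1 was superseded 350 seconds before `now` and is dropped, while versions 2 and 3 fall within the bound and are kept. The faster weight updates reach the samplers, the fewer rollouts a filter like this would need to discard.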