We use previous generations of Composer to train future ones. Our autoinstall system has earlier Composer models set up dev environments for RL training. That way, the next generation can focus on learning to solve harder problems. https://t.co/GbZILEfhAt
Cursor Automates RL Training Setup by Bootstrapping Future Models With Past Ones
Cursor revealed its autoinstall system, which uses previous generations of its Composer model to automatically set up and verify runnable development environments for reinforcement learning training. By using AI to handle the plumbing of environment configuration and dependency mocking, the team ensures that student models receive a clean reward signal instead of wasting compute debugging setup failures.
Cursor's autoinstall system uses an earlier model, Composer 1.5, to transform unconfigured repository checkouts into stable, runnable environments for reinforcement learning (training via trial and error).
- Terminal-Bench score (Composer 2): 61.7%
- Terminal-Bench score (Composer 1.5): 47.9%
- Autoinstall stages: 2
- Maximum setup attempts: 5
- Proposed commands per environment: 10
Reinforcement learning requires a functional environment to provide a valid reward signal; if a setup is broken, the model wastes compute on debugging rather than solving problems. This update follows Cursor's real-time RL pipeline by moving the bottleneck from human-led environment prep to autonomous agentic engineering.
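To make the reported figures concrete, here is a minimal Python sketch of a two-stage setup loop. The article does not spell out what each stage does, so the split below is an assumption: stage 1 proposes up to 10 commands that should succeed in a working environment, and stage 2 retries setup up to 5 times until all of them pass. The names `autoinstall`, `propose_commands`, `attempt_setup`, and `run_command` are hypothetical placeholders, not Cursor's API.

```python
from dataclasses import dataclass
from typing import Callable, List

MAX_SETUP_ATTEMPTS = 5      # "Maximum setup attempts" figure from the article
MAX_PROPOSED_COMMANDS = 10  # "Proposed commands per environment" figure


@dataclass
class SetupResult:
    ok: bool            # True if every check command passed
    attempts: int       # number of setup attempts used
    failing: List[str]  # commands still failing at the end


def autoinstall(
    propose_commands: Callable[[], List[str]],   # hypothetical stage-1 model call
    attempt_setup: Callable[[List[str]], None],  # hypothetical stage-2 model call
    run_command: Callable[[str], bool],          # True if the command succeeds
) -> SetupResult:
    # Stage 1 (assumed): decide which commands define a "working" environment.
    checks = propose_commands()[:MAX_PROPOSED_COMMANDS]
    failing = list(checks)

    # Stage 2 (assumed): try to set up the environment, then verify it.
    for attempt in range(1, MAX_SETUP_ATTEMPTS + 1):
        attempt_setup(failing)  # pass along what is still broken
        failing = [cmd for cmd in checks if not run_command(cmd)]
        if not failing:
            return SetupResult(ok=True, attempts=attempt, failing=[])

    # Give up: a broken environment would only add noise to the reward signal.
    return SetupResult(ok=False, attempts=MAX_SETUP_ATTEMPTS, failing=failing)


if __name__ == "__main__":
    # Toy stand-ins: the fake environment starts working after three setup attempts.
    state = {"installs": 0}

    def fake_setup(failing: List[str]) -> None:
        state["installs"] += 1

    def fake_run(cmd: str) -> bool:
        return state["installs"] >= 3

    result = autoinstall(lambda: ["pytest -q", "npm test"], fake_setup, fake_run)
    print(result)  # SetupResult(ok=True, attempts=3, failing=[])
```

In this sketch, environments that still fail after the final attempt are reported as not ok, so a training pipeline could drop them rather than let the next model spend compute debugging setup.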
While this is an internal research update, the results are already visible in the editor: Composer 2 scores 61.7% on Terminal-Bench, up 13.8 percentage points from Composer 1.5's 47.9%. This capability builds on Cursor's agent harness engineering and joins Cursor 3.3's context usage breakdown in making agentic workflows more transparent and efficient.