Cursor Automates RL Training Setup by Bootstrapping Future Models With Past Ones

May 7, 2026

Cursor, an AI-native code editor, has introduced autoinstall, a system that bootstraps the training of its Composer agentic coding models. It uses a teacher model, such as Composer 1.5, to turn unconfigured repository checkouts into stable, runnable environments for reinforcement learning (training via trial and error).

Terminal-Bench score (Composer 2): 61.7%
Terminal-Bench score (Composer 1.5): 47.9%
Autoinstall stages: 2
Maximum setup attempts: 5
Proposed commands per environment: 10

Reinforcement learning requires a functional environment to provide a valid reward signal; if the setup is broken, the model spends compute debugging the environment rather than solving the task. The update follows Cursor's real-time RL pipeline and shifts environment preparation, previously a human-led bottleneck, to autonomous agents.

Although this is an internal research update, the results are already visible in the editor: Composer 2 now scores 61.7% on Terminal-Bench, up from 47.9% for its predecessor, Composer 1.5. The capability builds on Cursor's agent harness engineering and complements the context usage breakdown in Cursor 3.3, both aimed at more transparent, efficient agentic workflows.

View the full update on cursor.com

Cursor

@cursor_ai, May 6

We use previous generations of Composer to train future ones. Our autoinstall system has earlier Composer models set up dev environments for RL training. That way, the next generation can focus on learning to solve harder problems. https://t.co/GbZILEfhAt


Still wondering? A few quick answers below.

Composer autoinstall is an automated system that creates runnable development environments from unconfigured code repositories. It uses AI agents to install packages, configure settings, and mock missing dependencies like database tables or files. This ensures that reinforcement learning training happens in a stable environment where the model can receive a clear success signal.

The system operates in two stages. First, a goal-setting agent explores the codebase to propose setup commands and expected outputs. Second, a separate agent attempts to execute those commands, mocking missing components or creating containers as needed. If the environment cannot be verified after five attempts, it is discarded to prevent wasting training compute.
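The two stages and the five-attempt cutoff described above can be sketched as a small control loop. This is a minimal illustration, not Cursor's implementation: `SetupGoal`, `autoinstall`, the two callables, and the substring-based verification are all hypothetical names and choices made for the example. Only the two-stage split and the limit of five attempts come from the article.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

MAX_ATTEMPTS = 5  # from the article: discard the environment after five failed attempts


@dataclass
class SetupGoal:
    """A proposed setup command and the output expected from it (hypothetical shape)."""
    command: str
    expected_output: str


def autoinstall(
    propose_goals: Callable[[], List[SetupGoal]],
    attempt_setup: Callable[[List[SetupGoal], int], Dict[str, str]],
    max_attempts: int = MAX_ATTEMPTS,
) -> bool:
    """Return True if the environment was verified, False if it should be discarded."""
    # Stage 1: a goal-setting agent explores the repo and proposes commands
    # together with the output each one should produce.
    goals = propose_goals()

    # Stage 2: a separate agent tries to make those commands succeed.
    for attempt in range(1, max_attempts + 1):
        observed = attempt_setup(goals, attempt)  # maps command -> observed output
        # Assumption: verification is a simple substring match on the output.
        if all(g.expected_output in observed.get(g.command, "") for g in goals):
            return True
    return False  # never verified: drop it rather than spend training compute


# Stand-in agents for demonstration only.
def fake_propose() -> List[SetupGoal]:
    return [SetupGoal("pytest -q", "passed")]


def fake_attempt(goals: List[SetupGoal], attempt: int) -> Dict[str, str]:
    # Pretend the setup agent only gets the tests running on its third try.
    return {"pytest -q": "3 passed" if attempt >= 3 else "ModuleNotFoundError"}


verified = autoinstall(fake_propose, fake_attempt)
```

Keeping goal proposal separate from execution means the success criteria are fixed before the setup agent starts work, so the agent cannot pass verification by redefining what "working" means.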

Cursor uses earlier models to handle the plumbing of environment setup so the next generation can focus on solving harder problems. By bootstrapping with a teacher model, the team avoids wasting expensive compute on student models debugging basic configuration issues. This process ensures the training data remains high-quality and the reward signals are accurate.

By training in more reliable environments, Composer 2 achieved a significant performance boost on Terminal-Bench, a benchmark that measures how well AI agents complete tasks in a terminal. It scored 61.7 percent, compared to 47.9 percent for the previous version, Composer 1.5. This suggests that the model is becoming more capable of navigating complex real-world project setups.
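To put the two scores in perspective, the gap can be expressed both in percentage points and as a relative improvement. The short calculation below uses only the two figures quoted above; the variable names are illustrative.

```python
composer_2 = 61.7    # Terminal-Bench score reported for Composer 2 (percent)
composer_1_5 = 47.9  # Terminal-Bench score reported for Composer 1.5 (percent)

# Absolute gain, in percentage points.
absolute_gain = composer_2 - composer_1_5

# Relative gain over the earlier model's score, in percent.
relative_gain = absolute_gain / composer_1_5 * 100

print(f"{absolute_gain:.1f} points, {relative_gain:.1f}% relative")
# prints "13.8 points, 28.8% relative"
```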

While autoinstall is primarily a research system for model training, it is inspired by the cloud agents feature already available in Cursor. Cloud agents automate the setup of remote environments for users, allowing AI to work on projects in a stable, mocked environment. The research findings from autoinstall are used to improve these production agent capabilities.