We've teamed up with @cerebras to offer free Windsurf plans for SWE-1.6 Fast Mode at up to 1000 tok/s! Fast Mode is built on Cerebras inference, enabling superior speed for planning and development with state of the art capabilities.
Windsurf Partners With Cerebras to Deliver 1000 Tokens Per Second Coding
Windsurf has integrated Cerebras inference to power SWE-1.6 Fast Mode, which reaches speeds of up to 1,000 tokens per second for agentic workflows. The goal is to remove the latency bottleneck in multi-step planning and autonomous code generation.
Fast Mode runs the SWE-1.6 model at up to 1,000 tokens per second. It uses specialized inference hardware to accelerate the model, following a pattern seen in Cognition's SWE-1.6 Fast release for terminal-based agents.
- Model: SWE-1.6
- Inference speed: up to 1,000 tokens per second
- Hardware: Cerebras Inference
- Credits per giveaway plan: $250
- Availability: Windsurf Editor
The update follows Windsurf 2.0 and its background delegation. Real-time interactive coding, however, still suffers from frontier-model latency. Fast Mode's throughput of up to 1,000 tok/s is meant to work alongside Windsurf's parallel agentic coding to reduce the friction of rapid iteration.
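To put the throughput figure in perspective, here is a rough back-of-the-envelope sketch of how decode speed translates into wait time across a multi-step agent run. Only the 1,000 tok/s figure comes from the announcement; the per-step token counts and the 100 tok/s comparison rate are hypothetical values chosen for illustration, and the estimate ignores time-to-first-token, tool calls, and network overhead.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Seconds needed to stream `output_tokens` at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


def workflow_seconds(step_tokens: list[int], tokens_per_second: float) -> float:
    """Total decode time for a sequence of agent steps run one after another."""
    return sum(generation_seconds(t, tokens_per_second) for t in step_tokens)


# Hypothetical output sizes (in tokens) for a five-step agentic task.
STEP_TOKENS = [800, 1200, 500, 2000, 1500]

FAST_MODE_TPS = 1000.0  # upper bound quoted in the announcement
BASELINE_TPS = 100.0    # assumed comparison rate, not from the announcement

fast = workflow_seconds(STEP_TOKENS, FAST_MODE_TPS)
baseline = workflow_seconds(STEP_TOKENS, BASELINE_TPS)

print(f"Fast Mode decode time: {fast:.1f} s")
print(f"Baseline decode time:  {baseline:.1f} s")
print(f"Speedup:               {baseline / fast:.1f}x")
```

Because the estimate is a simple ratio of tokens to throughput, the speedup equals the ratio of the two rates regardless of how the tokens are split across steps. Real sessions will see less than this, since non-decode overheads do not shrink with faster inference.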
To promote the new tier, Windsurf is giving away five plans with $250 in credits each to users who engage with the announcement. Windsurf positions the Cerebras-powered mode as a faster option for planning and development tasks that demand state-of-the-art capabilities, without the wait times typical of high-reasoning models.



