HeadsUpAI

Windsurf Partners With Cerebras to Deliver 1000 Tokens Per Second Coding

Windsurf, an AI-native editor from Cognition, has partnered with Cerebras to serve its SWE-1.6 model at up to 1,000 tokens per second. This "Fast Mode" runs on Cerebras's specialized inference hardware, following a pattern seen in Cognition's SWE-1.6 Fast release for terminal-based agents.
Model: SWE-1.6
Inference speed: up to 1,000 tokens per second
Hardware: Cerebras Inference
Credits per giveaway plan: $250
Availability: Windsurf Editor

The update follows Windsurf 2.0 and its background delegation. Real-time interactive coding, however, still suffers from frontier-model latency, and generation at up to 1,000 tok/s is meant to work alongside Windsurf's parallel agentic coding to reduce the friction of rapid iteration.

To promote the new tier, Windsurf is giving away five plans with $250 in credits each to users who engage with the announcement. Windsurf positions the Cerebras-powered mode as a faster option for planning and development tasks that call for state-of-the-art capabilities without the usual wait times of high-reasoning models.

Windsurf (@windsurf) on X:

We've teamed up with @cerebras to offer free Windsurf plans for SWE-1.6 Fast Mode at up to 1000 tok/s! Fast Mode is built on Cerebras inference, enabling superior speed for planning and development with state of the art capabilities.

10 retweets, 168 likes

Still wondering? A few quick answers below.

Windsurf SWE-1.6 Fast Mode is a high-speed version of Cognition's agentic coding model integrated into the Windsurf editor. It is designed for rapid software engineering tasks, including planning and multi-file development. The mode uses specialized inference hardware to minimize the latency typically found in complex AI agent workflows that require multiple reasoning steps.

SWE-1.6 Fast Mode runs at up to 1,000 tokens per second. This performance comes from a partnership with Cerebras and its specialized inference infrastructure. At that rate, the AI agent can generate code and iterate on complex engineering plans with far less waiting than at typical cloud-hosted model speeds.
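
To put the quoted rate in perspective, here is a minimal back-of-the-envelope sketch in Python. Only the 1,000 tokens per second figure comes from the announcement, where it is stated as an "up to" peak. The comparison rate of 100 tokens per second and the 2,000-token response size are hypothetical values chosen for illustration, and the helper name `generation_seconds` is likewise invented for this example.

```python
def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Return the time needed to emit num_tokens at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


FAST_MODE_TPS = 1_000     # peak rate quoted in the announcement ("up to")
BASELINE_TPS = 100        # hypothetical comparison rate, not from the source
RESPONSE_TOKENS = 2_000   # hypothetical size of one generated response

fast = generation_seconds(RESPONSE_TOKENS, FAST_MODE_TPS)
baseline = generation_seconds(RESPONSE_TOKENS, BASELINE_TPS)

# Prints: Fast Mode: 2.0 s, baseline: 20.0 s, speedup: 10x
print(f"Fast Mode: {fast:.1f} s, baseline: {baseline:.1f} s, "
      f"speedup: {baseline / fast:.0f}x")
```

The calculation covers token generation only. It ignores time to first token, network round trips, and the time an agent spends running tools, so real end-to-end gains would likely be smaller than the raw ratio suggests.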

Windsurf is currently running a promotion in which five users can win free plans with $250 in credits to access the Cerebras-powered Fast Mode. To qualify, users must reply to the official announcement from Cerebras. While a free version of SWE-1.6 exists at lower speeds, the tier running at up to 1,000 tokens per second is a premium offering.

According to the announcement, SWE-1.6 Fast Mode powered by Cerebras offers a clear speed advantage in side-by-side comparisons with models like Claude. Higher throughput shortens the planning and execution phases of development, which leaves room for more iterations and faster bug fixes and, in turn, can improve overall code quality.
