Intelligence at 1000 tokens per second, right in your terminal. Now available with SWE-1.6 Fast, powered by @cerebras. We're giving the first 100 people who respond a free month of Max to try it out. https://t.co/ExGS4bu4YB
Cognition Hits 1000 Tokens Per Second in Devin for Terminal
Cognition · Updated
Cognition released SWE-1.6 Fast for its terminal-based agent, Devin for Terminal, reaching 1,000 tokens per second through a partnership with Cerebras. At this speed, agentic loops run near-instantaneously. Developers can also start tasks locally and hand them off to cloud VMs for persistent execution.
- Inference speed: 1,000 tokens per second
- Hardware partner: Cerebras
- CLI language: Rust
- Supported models: SWE-1.6 Fast, GPT-5.5, Opus 4.7
- Availability: Devin for Terminal
High-speed inference is critical for agentic coding because agents observe, reason, and act in iterative loops, so generation time is paid again on every iteration. By crossing the 1,000-tokens-per-second threshold, Cognition moves toward real-time autonomous engineering, matching Windsurf's Cerebras-powered SWE-1.6 Fast Mode and other hardware-accelerated inference stacks that prioritize low-latency execution.
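As a rough illustration of why throughput compounds across an agent loop, the sketch below estimates total generation time for a multi-step task. Only the 1,000 tokens-per-second figure comes from the announcement; the step count, the tokens generated per step, and the 100 tokens-per-second comparison rate are hypothetical values chosen for the arithmetic, and tool-call and network latency are ignored.

```python
def generation_seconds(steps: int, tokens_per_step: int, tokens_per_second: float) -> float:
    """Time spent generating tokens across an agent loop (tool and network latency ignored)."""
    return steps * tokens_per_step / tokens_per_second


# Hypothetical workload: 20 loop iterations, 500 generated tokens in each.
STEPS = 20
TOKENS_PER_STEP = 500

fast = generation_seconds(STEPS, TOKENS_PER_STEP, 1000.0)  # rate quoted in the announcement
slow = generation_seconds(STEPS, TOKENS_PER_STEP, 100.0)   # assumed slower rate, for comparison only

print(f"1,000 tok/s: {fast:.0f} s of generation")
print(f"  100 tok/s: {slow:.0f} s of generation")
```

Under these assumptions the same task spends 10 seconds generating at 1,000 tokens per second and 100 seconds at the slower rate. The gap grows linearly with the number of loop iterations.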
You can install the Rust-based CLI locally to give the agent direct access to your codebase. The tool supports a hybrid workflow: you initiate a task locally and hand it off to a cloud VM. Beyond the native SWE-1.6 Fast model, Devin for Terminal also supports GPT-5.5 and Opus 4.7.




