HeadsUpAI

StepFun Launches Step 3.7 Flash With Open Weights and Agent Advisor Mode

StepFun, an AI lab focused on large-scale foundation models, has launched Step 3.7 Flash with open weights under the Apache 2.0 license. The model uses a Mixture-of-Experts (MoE) architecture, which activates only a subset of parameters for each token: 11B of its 196B total parameters are active at a time, with reported throughput of up to 400 tokens per second.
Total parameters: 196B
Active parameters: 11B
Throughput: 400 TPS
License: Apache 2.0
ClawEval-1.1 score: 67.1
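
Two derived figures help put these specs in context: the share of parameters active per token, and how long a response would take at the quoted throughput. The sketch below is back-of-the-envelope arithmetic on the published numbers only. The 2,000-token response length is an illustrative assumption, and real throughput will vary with hardware, batch size, and context length.

```python
# Back-of-the-envelope figures derived from the published specs.
TOTAL_PARAMS_B = 196   # total parameters, in billions (from the release)
ACTIVE_PARAMS_B = 11   # parameters active per token, in billions
THROUGHPUT_TPS = 400   # quoted peak tokens per second


def active_fraction(active_b: float, total_b: float) -> float:
    """Share of the model's parameters used for each token."""
    return active_b / total_b


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to generate num_tokens at a steady decoding rate."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active share per token: {share:.1%}")
    # 2,000 tokens is an assumed response length, not a figure from the release.
    seconds = generation_seconds(2000, THROUGHPUT_TPS)
    print(f"2,000 tokens at peak rate: {seconds:.1f} s")
```

Roughly one parameter in eighteen is active for any given token, which is where the speed advantage of the MoE design comes from.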

The release targets demand for agentic efficiency, where speed matters as much as reasoning. It joins other high-speed models such as DeepSeek's V4 Flash and Moonshot AI's Kimi K2.6. Its #1 ranking on the ClawEval-1.1 benchmark suggests that Flash-tier models can now handle complex, multi-step autonomous workflows.

You can deploy Step 3.7 Flash for coding and GUI-based tasks using its native vision capabilities. Its Advisor Mode allows the model to act as an executor that consults a larger advisor only at difficult inflection points, reducing per-task costs by nearly 90%. Weights are available on Hugging Face.
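
To see how a saving of this size could arise, here is a minimal cost model. It assumes every step runs on the executor and a fraction of steps additionally consult the advisor. The function name, the per-step costs, and the escalation rate are all hypothetical values chosen for illustration; the article gives no prices or escalation rates.

```python
def advisor_mode_saving(executor_cost: float, advisor_cost: float,
                        escalation_rate: float) -> float:
    """Fractional saving versus running every step on the advisor.

    executor_cost, advisor_cost: cost per step on each model (same units).
    escalation_rate: fraction of steps that also consult the advisor.
    """
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    # Every step pays the executor; escalated steps also pay the advisor.
    blended = executor_cost + escalation_rate * advisor_cost
    return 1.0 - blended / advisor_cost


if __name__ == "__main__":
    # Hypothetical inputs: advisor 20x the executor's cost, 5% of steps escalated.
    saving = advisor_mode_saving(executor_cost=1.0, advisor_cost=20.0,
                                 escalation_rate=0.05)
    print(f"Saving vs. advisor-only: {saving:.0%}")
```

Under these assumed inputs the saving is dominated by two ratios: how much cheaper the executor is, and how rarely it escalates.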

StepFun (@StepFun_ai) on X:

⚡️ Step 3.7 Flash is here: The new frontier is agent efficiency. #1 ClawEval-1.1 (67.1), #1 SimpleVQA Search (79.2), #2 SWE-PRO (56.3), 95.3 on V* Python. Open weights under Apache 2.0. Built for agentic, coding, search, and multimodal workflows — balancing speed, cost, and https://t.co/mzLi1HnkxU


Still wondering? A few quick answers below.

Step 3.7 Flash is a high-efficiency multimodal foundation model developed by StepFun. It uses a Mixture of Experts architecture with 196 billion total parameters and 11 billion active parameters. The model is specifically optimized for agentic workflows, balancing frontier-level reasoning with high-speed performance of up to 400 tokens per second for autonomous tasks.

Step 3.7 Flash is released as an open-weight model under the Apache 2.0 license. This allows developers to download, deploy, and build upon the model weights for commercial or research purposes. The model is currently available through Hugging Face, ModelScope, and the official StepFun Open Platform.

Advisor Mode is a strategy in which Step 3.7 Flash acts as the primary executor and handles the majority of a task. It consults a larger, more capable advisor model only at critical inflection points, such as complex planning or error recovery. This approach lets the system reach near-frontier performance while cutting per-task costs by nearly 90%.
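
The executor/advisor split can be sketched as a simple routing loop. This is a minimal illustration, not StepFun's implementation: the article does not say how inflection points are detected, so the confidence score, the threshold, and every name below (`StepResult`, `run_with_advisor`, the stub models) are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    output: str
    confidence: float  # hypothetical self-reported confidence in [0, 1]


def run_with_advisor(
    steps: List[str],
    executor: Callable[[str], StepResult],
    advisor: Callable[[str], str],
    threshold: float = 0.5,
) -> Tuple[List[str], int]:
    """Run each step on the executor; escalate low-confidence steps."""
    outputs: List[str] = []
    escalations = 0
    for step in steps:
        result = executor(step)
        if result.confidence < threshold:
            # Assumed trigger: low confidence marks an "inflection point".
            outputs.append(advisor(step))
            escalations += 1
        else:
            outputs.append(result.output)
    return outputs, escalations


def stub_executor(step: str) -> StepResult:
    """Stand-in for the small model; planning and recovery count as hard."""
    hard = "plan" in step or "recover" in step
    return StepResult(output=f"executor:{step}", confidence=0.2 if hard else 0.9)


def stub_advisor(step: str) -> str:
    """Stand-in for the larger advisor model."""
    return f"advisor:{step}"


if __name__ == "__main__":
    tasks = ["plan migration", "edit file", "run tests", "recover from failure"]
    outputs, escalations = run_with_advisor(tasks, stub_executor, stub_advisor)
    print(outputs)
    print(f"Escalated {escalations} of {len(tasks)} steps")
```

In a real system the escalation trigger might instead be a failed tool call, a repeated error, or an explicit request from the executor; the threshold here only stands in for whatever signal is used.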

Step 3.7 Flash ranks first on the ClawEval-1.1 benchmark for autonomous task execution with a score of 67.1. It also leads the SimpleVQA Search benchmark for visual tool use at 79.2. In agentic coding, it scored 56.3 on SWE-Bench Pro, which StepFun's announcement lists as the second-highest result.

You can access Step 3.7 Flash through the StepFun Open Platform API or partner platforms such as OpenRouter and NVIDIA NIM. For local deployment, the model supports vLLM and llama.cpp. Running it effectively requires high-memory hardware, such as NVIDIA DGX systems or Mac Studio devices with at least 128 GB of unified memory.
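
The memory requirement follows from the total parameter count, not the active count: in a typical MoE deployment all experts stay resident in memory even though only a few are used per token. The sketch below estimates weight storage at a few common precisions. It is a rough estimate under stated assumptions: the article does not say which precision the 128 GB figure assumes, and the calculation ignores the KV cache, activations, and runtime overhead.

```python
def weight_memory_gb(total_params_billion: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes."""
    bytes_total = total_params_billion * 1e9 * bits_per_param / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    # 196B total parameters is from the release; the precisions are assumptions.
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label}: ~{weight_memory_gb(196, bits):.0f} GB")
```

On this estimate, only a quantization of around 4 bits per parameter would leave the weights under 128 GB, with the remainder available for the KV cache and the operating system.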
