The new Qwen3.7-Max from @Alibaba_Qwen is live on OpenRouter. The flagship of the Qwen3.7 series, built for agent-centric work: coding, office and productivity tasks, and long-horizon autonomous execution. Big jumps in coding and agent benchmarks over Qwen3.6, with explicit prompt caching for repeated context.
OpenRouter Adds Qwen3.7-Max for Long Horizon Agentic Coding and Office Tasks
· Updated
OpenRouter integrated Alibaba's Qwen3.7-Max, a flagship model optimized for autonomous agent loops and multi-hour task execution. The update introduces explicit prompt caching for the Qwen series, allowing developers to maintain massive context windows at a 90 percent discount on subsequent requests.
- Cache read price
- 0.1x base input
- Cache write price
- 1.25x base input
- Cache TTL
- 5 minutes
- Availability
- OpenRouter API
- Supported models
- Qwen3.7-Max, Qwen-Plus, Qwen3.6-Plus, and others
The update addresses the economic bottleneck of autonomous workflows where agents repeatedly process large codebases. By supporting explicit breakpoints, the model allows for a 90% reduction in costs for cached reads. This follows the Alibaba Qwen3.7-Max launch and complements OpenRouter's Long Horizon Primitives.
You can access qwen/qwen3-max via the OpenRouter API using standard cache_control parameters. To maximize cache hits, the platform uses provider sticky routing to keep subsequent requests on the same endpoint. Caching is also available for other models on the platform, with pricing for Qwen set at 1.25x for writes and 0.1x for reads.
Still wondering? A few quick answers below.
Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →





