HeadsUpAI

Cloudflare adds MiniMax M3 with 1M context for agentic coding

Cloudflare has integrated the newly launched MiniMax M3 foundation model into its AI Gateway, enabling native multimodal processing and a 1-million-token context window (the maximum amount of input a model can process at once). The model is specialized for agentic coding (AI that autonomously writes, tests, and debugs code) and is now accessible through a single fetch call.
Model: MiniMax M3
Context Window: 1,000,000 tokens
Platform: Cloudflare AI Gateway

Hosting a model with frontier coding performance gives developers a specialized alternative to general-purpose proprietary systems. The integration follows a pattern of high-context models reaching developer gateways, such as the Qwen3.5-Plus update on Vercel, and joins Vercel's own MiniMax M3 integration in expanding access to 1M-context workflows. It also positions M3 against other 1M-context models, such as the DeepSeek-V4 release, that target autonomous development across massive codebases.

Developers can access minimax-m3 via the AI Gateway now. This availability targets teams building agentic workflows that require processing large repositories or long-form documents without managing their own model infrastructure.

Still wondering? A few quick answers below.

MiniMax M3 is a natively multimodal foundation model designed for high-performance coding and agentic tasks. It features a massive 1-million-token context window, allowing it to process entire codebases or long documents in a single interaction. The model is available as an open-weight release for broad developer use.

Developers can access MiniMax M3 through the Cloudflare AI Gateway. By using a single fetch call, users can route their AI requests to the M3 model while benefiting from Cloudflare's management features, including rate limiting and caching, without needing to manage the underlying model infrastructure themselves.
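As a rough illustration of what that single fetch call might look like, here is a minimal TypeScript sketch. It is not taken from Cloudflare's documentation: the chat-style JSON payload, the bearer-token header, and the helper names (`buildM3Request`, `askM3`, `FetchLike`) are assumptions for illustration, and the gateway URL is left as a parameter because the announcement does not specify it. Only the model identifier `minimax-m3` comes from the article.

```typescript
// Minimal structural type for a fetch-like function, so the sketch does not
// depend on any particular runtime's type definitions.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface GatewayRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the request without sending it.
// ASSUMPTION: the payload shape and Authorization header are hypothetical;
// check the gateway's documentation for the real format.
function buildM3Request(gatewayUrl: string, apiToken: string, prompt: string): GatewayRequest {
  return {
    url: gatewayUrl, // ASSUMPTION: full endpoint URL supplied by the caller
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiToken}`,
      },
      body: JSON.stringify({
        model: "minimax-m3", // identifier as written in the article
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Sends the request with exactly one fetch call and returns the parsed JSON.
async function askM3(
  fetchFn: FetchLike,
  gatewayUrl: string,
  apiToken: string,
  prompt: string,
): Promise<unknown> {
  const { url, init } = buildM3Request(gatewayUrl, apiToken, prompt);
  const response = await fetchFn(url, init);
  if (!response.ok) {
    throw new Error(`Gateway request failed with status ${response.status}`);
  }
  return response.json();
}
```

In a runtime that provides a global `fetch`, it can be passed in as `fetchFn`. Separating request construction from sending keeps the payload easy to inspect before anything goes over the network.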

To celebrate the launch, Cloudflare is offering a 50% discount on MiniMax M3 inference costs during the first week of availability. This promotion applies specifically to requests with a context size of 512,000 tokens or less, encouraging developers to test the model's performance on mid-to-large scale tasks.
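The discount rule can be expressed as a small calculation. The sketch below assumes the promotion is applied per request based on that request's context size, and it takes the undiscounted cost as an input because the article does not state M3's pricing; `estimateLaunchCost` and `baseCostUsd` are hypothetical names.

```typescript
const DISCOUNT_RATE = 0.5;           // 50% off, per the article
const DISCOUNT_MAX_CONTEXT = 512000; // tokens, per the article

// Returns the estimated cost of one request during the promotional week.
// ASSUMPTION: eligibility is decided per request by its context size;
// baseCostUsd is a caller-supplied, hypothetical undiscounted price.
function estimateLaunchCost(contextTokens: number, baseCostUsd: number): number {
  const eligible = contextTokens <= DISCOUNT_MAX_CONTEXT;
  return eligible ? baseCostUsd * (1 - DISCOUNT_RATE) : baseCostUsd;
}
```

Under these assumptions, a request at exactly 512,000 tokens is billed at half its base cost, while a request one token over the limit is billed at the full base cost.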

Every HeadsUpAI update is written based on its original source and reviewed before it's published.
