HeadsUpAI

OpenRouter Adds Gemini 3.5 Flash for High Performance Agentic Coding

OpenRouter, a unified API platform for accessing hundreds of LLMs, added support for Google's Gemini 3.5 Flash. This high-efficiency multimodal model is optimized for coding and agentic loops. It features a 1 million token context window and a 65,536 token output limit, supporting text, image, video, audio, and PDF inputs.
Context window
1,048,576 tokens
Max output
65,536 tokens
Pricing (input)
$1.50 per million tokens
Pricing (output)
$9.00 per million tokens
Thinking levels
Minimal, Low, Medium, and more
Input modalities
Text, Image, Video, and more

This release shifts the performance-to-price ratio by beating the older Gemini 3.1 Pro on software engineering benchmarks, mirroring Google's Gemini 3.5 Flash launch. It follows the recent addition of Gemini 3.1 Flash Lite to the platform, but offers higher reasoning depth. Moving frontier-level coding into the Flash tier makes high-volume agentic development economically viable.

You can access the model via the OpenRouter API for $1.50 per million input tokens, matching Vercel's Gemini 3.5 Flash integration. The implementation supports fine-grained thinking levels to balance cost and intelligence, similar to Google's inference tiers. Developers can use the reasoning parameter to inspect the model's internal step-by-step logic.

OpenRouter
OpenRouter
@OpenRouter
X

Gemini 3.5 Flash from @GoogleDeepMind is live on OpenRouter! Beats Gemini 3.1 Pro on coding, agentic work, and tool use at Flash-tier price and speed. 1M context, 65K max output, multimodal. $1.50/M input, $9/M output. https://t.co/VkTuvWz2ey

18retweets282likes
View on X

Still wondering? A few quick answers below.

Gemini 3.5 Flash is a high-efficiency multimodal model from Google DeepMind now available through the OpenRouter API. It is designed to provide near-Pro level reasoning and coding performance while maintaining the speed and low cost of the Flash model tier. It supports text, image, video, audio, and PDF inputs for complex processing tasks.

According to performance data from Google and OpenRouter, Gemini 3.5 Flash outperforms the older Gemini 3.1 Pro model specifically in coding proficiency, agentic workflows, and tool use. This shift allows developers to access frontier-level software engineering capabilities at the significantly lower price point and higher speed typically associated with the Flash series of models.

OpenRouter lists the pricing for Gemini 3.5 Flash at $1.50 per million input tokens and $9.00 per million output tokens. This pricing applies to its 1 million token context window. The platform also supports specialized pricing for other modalities, such as $3.00 per million tokens for audio input and $0.014 per request for web search grounding.

Gemini 3.5 Flash features a massive context window of 1,048,576 tokens, allowing it to process very large documents or entire codebases in a single request. The model also supports a maximum output of 65,536 tokens per response, which is significantly higher than many standard models and useful for generating long-form code or detailed reports.

Gemini 3.5 Flash supports configurable thinking levels, which allow developers to adjust the model's internal reasoning effort. Users can choose between minimal, low, medium, and high levels to balance response speed against intelligence. By default, the model uses a medium thinking effort to provide a cost-effective balance for most general reasoning and coding applications.

Share this update