OpenRouter Adds Cost Quality Slider to Automate Model Selection Expenses

OpenRouter

Jun 1, 2026 · Updated Jun 20, 2026

OpenRouter introduced a new parameter that lets users manually balance model performance against token costs on a 0–10 scale. The update gives developers granular control over how the Auto Router selects from frontier models based on prompt complexity and budget.

OpenRouter released a cost_quality_tradeoff parameter for its Auto Router, an intelligent selection tool powered by NotDiamond. The update introduces a 0–10 scale to weigh model capability against price. A value of 0 prioritizes the most capable model regardless of cost, while 10 forces the cheapest available option.

Parameter: cost_quality_tradeoff
Range: 0 to 10
Default value: 7
Routing engine: NotDiamond
Model pool: GPT-5.1, Claude Sonnet 4.5, Gemini 3.1 Pro, and others

This control addresses the complexity of managing inference (the process of running a trained model to generate outputs) costs. It extends the logic of Auto Exacto and complements the Model Comparison Tool. While Pareto Code targets coding benchmarks, this update provides a general-purpose lever for all prompt types.

Implement the tradeoff by adding the cost_quality_tradeoff integer to the plugins section of an API request or the settings UI. The default is 7, balancing savings with quality. The router selects from a pool including openai/gpt-5.1 and anthropic/claude-sonnet-4.5 at standard rates.

View the full update on openrouter.ai

OpenRouter

@OpenRouterJun 1

The Auto Router now lets you tune how it weighs cost against quality. New `cost_quality_tradeoff` parameter, 0 to 10: Set it to 0 and it always picks the most capable model regardless of price. Set it to 10 and the cheapest model wins. https://t.co/nWr4AuwMC2

345

View on X

Still wondering? A few quick answers below.

The Auto Router is an intelligent model selection tool that automatically chooses the best LLM for a specific prompt. It analyzes factors like task type and prompt complexity to route requests to a curated set of high-performance models, including options from OpenAI, Anthropic, and Google.

The parameter uses a 0–10 scale to define routing priorities. A setting of 0 ignores price to select the most capable model available, while a setting of 10 prioritizes the lowest cost. Intermediate values allow users to find an optimal balance for their specific use case.

No, there is no additional fee for using the Auto Router. Users simply pay the standard per-token rate for whichever model the system selects to handle the request. This allows for dynamic cost optimization without adding overhead to the existing OpenRouter pricing structure.

Yes, users can configure allowed models using wildcard patterns in the API request or the settings UI. For example, you can limit the router to only select from Anthropic models or specific versions of GPT, ensuring the system stays within preferred provider or capability boundaries.

Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →

See all AI news & updates from OpenRouter →

Keep reading

OpenRouter Reaches 13B Daily Tokens as Automated Model Routing Scales

OpenRouter's automated routing engines now process 13 billion tokens daily, with the coding-specific Pareto Router hitting 1 billion. The milestone coincides with new granular controls that let users manually balance model performance against token costs. This shift highlights how developers are moving from static model selection to dynamic, algorithmic orchestration to manage AI expenses.

WarpMay 5

Warp Launches Open Weight Model Routing to Optimize Agentic Development Costs

Warp introduced an auto (open-weights) setting that routes terminal tasks to frontier-level open models based on request complexity. This update allows developers to maintain high performance during autonomous workflows while significantly reducing token expenses compared to the platform's high-end genius mode.

What is the OpenRouter Auto Router?

How does the cost quality tradeoff parameter work?

Does using the Auto Router increase API costs?

Can I restrict which models the Auto Router selects?

Keep reading

OpenRouter Reaches 13B Daily Tokens as Automated Model Routing Scales

OpenRouter Reaches 13B Daily Tokens as Automated Model Routing Scales

Warp Launches Open Weight Model Routing to Optimize Agentic Development Costs

Warp Launches Open Weight Model Routing to Optimize Agentic Development Costs

Keep reading

OpenRouter Reaches 13B Daily Tokens as Automated Model Routing Scales

OpenRouter Reaches 13B Daily Tokens as Automated Model Routing Scales

Warp Launches Open Weight Model Routing to Optimize Agentic Development Costs

Warp Launches Open Weight Model Routing to Optimize Agentic Development Costs