OpenRouter launches GPT-5.4 Image 2 to unify frontier reasoning and visual generation

openai/gpt-5.4-image-2. This multimodal model integrates the advanced reasoning of OpenAI's latest frontier model with the high-fidelity output of the GPT Image 2 engine to enable complex visual workflows.This release mirrors the industry shift toward functional precision seen in ChatGPT Images 2.0. By using the OpenAI Responses API, the model acts as an orchestrator that calls an internal image generation tool. This allows GPT-5.4 to technically refine user prompts, ensuring higher adherence to complex instructions.
Access the model via the OpenRouter API by specifying image and text modalities in your requests. Pricing is $8 per million input tokens and $15 per million output tokens, with generated images costing $30 per million tokens. The model features a 272,000-token context window.
Frequently asked questions
- What is GPT-5.4 Image 2?
- GPT-5.4 Image 2 is a multimodal model from OpenAI that combines the reasoning capabilities of GPT-5.4 with the high-quality visual generation of GPT Image 2. It allows users to perform complex tasks involving text, code, and images within a single interaction, using the language model to refine prompts for better visual accuracy.
- How does GPT-5.4 Image 2 generate images via the API?
- The model operates by calling the OpenAI Responses API, where the GPT-5.4 model acts as an orchestrator with access to an Image Generation server tool. When users specify both image and text modalities in their request, the model can generate images from text prompts and return them as base64-encoded data URLs.
- What is the pricing for GPT-5.4 Image 2 on OpenRouter?
- OpenRouter charges $8 per million input tokens and $15 per million output tokens for text processing. For visual tasks, image inputs are processed as part of the prompt, while image outputs are priced at $30 per million tokens. These rates allow developers to access OpenAI's frontier multimodal capabilities through a unified API.
- What are the context window and output limits for this model?
- GPT-5.4 Image 2 features a large 272,000-token context window, which allows it to process extensive documents or complex multi-image prompts in a single session. The model supports a maximum output of 128,000 tokens, providing significant headroom for generating long-form text alongside high-resolution visual assets in multimodal workflows.
- Who can use GPT-5.4 Image 2 on OpenRouter?
- The model is currently live and available to all developers using the OpenRouter platform. Users can access it through the unified API or test it directly in the OpenRouter Chat interface. It is designed for production environments requiring a balance of advanced reasoning, instruction following, and high-quality image generation at scale.

