Arena Ranks Google Gemma 4 Among Top Open Vision Models

Arena

May 8, 2026 · Updated Jun 5, 2026

Google's Gemma-4-31b and Gemma-4-26b-a4b have entered the Vision Arena leaderboard as the #2 and #4 ranked open models. These releases shift the price-performance frontier by delivering vision reasoning capabilities that rival proprietary systems at a fraction of the cost.

Arena, a community-driven platform for evaluating AI models through human preference voting, added Google's Gemma-4 family to its Vision Arena leaderboard. Gemma-4-31b debuted as the #2 open-weight model and Gemma-4-26b-a4b as #4. Both are multimodal models, meaning they process text and images together, and both are designed for advanced reasoning and agentic workflows.

Gemma-4-31b rank: #2 open, #20 overall
Gemma-4-26b-a4b rank: #4 open, #26 overall
License: Apache 2.0
Pricing (31b input): $0.14 per million tokens
Pricing (31b output): $0.40 per million tokens
Context window: 262.1K tokens

The rankings indicate that open-weight models are closing the gap with proprietary systems in visual reasoning. By outperforming several versions of GPT-4o at a fraction of the cost, Gemma-4 shifts the Pareto frontier, the set of models that no other model beats on both cost and capability at once. The result follows Gemma 4's manual visual token controls and Arena's agentic coding leaderboard sweep.
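To make the Pareto frontier idea concrete, here is a minimal sketch of how a frontier can be computed from price and score pairs. The model names, prices, and scores below are hypothetical placeholders, not Arena data, and the function is an illustration rather than Arena's own method.

```python
from typing import NamedTuple


class Model(NamedTuple):
    name: str
    price: float  # dollars per million input tokens (lower is better)
    score: float  # leaderboard score (higher is better)


def pareto_frontier(models: list[Model]) -> list[Model]:
    """Return the models that no other model beats on both price and score."""
    frontier: list[Model] = []
    best_score = float("-inf")
    # Walk from cheapest to most expensive; at equal price, best score first.
    for m in sorted(models, key=lambda m: (m.price, -m.score)):
        # A pricier model earns its place only by scoring strictly higher
        # than every cheaper (or equally priced) model seen so far.
        if m.score > best_score:
            frontier.append(m)
            best_score = m.score
    return frontier


# Hypothetical values, for illustration only.
models = [
    Model("model-a", 0.10, 1100.0),
    Model("model-b", 0.50, 1200.0),
    Model("model-c", 0.60, 1150.0),  # costs more than model-b, scores lower
    Model("model-d", 2.00, 1300.0),
]

for m in pareto_frontier(models):
    print(m.name, m.price, m.score)
```

In this made-up set, model-c drops out because model-b is both cheaper and higher scoring. A new model "shifts" the frontier when it joins this list and pushes previously optimal models off it.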

You can now deploy these models on private hardware under an Apache 2.0 license, making high-tier vision reasoning viable for local agentic loops. The 31b model is priced at $0.14 per million input tokens. This enables native multimodal capabilities in applications without the latency or privacy constraints of closed-source APIs.

View the full update on arena.ai

Arena.ai

@arena · May 7

Gemma-4 lands in Vision Arena as #2 & #4 open models, and shifts the Pareto frontier! @GoogleDeepMind dominates the price-performance Pareto in Vision across both proprietary and open models.

- Gemma-4-31b ranks #2 open (#20 overall)
- Gemma-4-26b-a4b ranks #4 open (#26 overall)

The Vision Arena ranks multimodal AI models capable of reasoning over visual inputs. Congrats to @GoogleDeepMind again on the open model progress!

Still wondering? A few quick answers below.

What is the Vision Arena leaderboard?

The Vision Arena is a community-driven evaluation platform that ranks multimodal AI models based on their ability to reason over visual inputs. It uses a blind human preference voting system where users compare model outputs to determine Elo ratings. As of May 2026, the leaderboard includes over 120 proprietary and open-weight models.
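For readers unfamiliar with Elo ratings, the sketch below shows the textbook Elo update applied to a single head-to-head vote. It is an illustration only: the source does not describe Arena's actual rating pipeline, and the K-factor of 32 and the 400-point scale are conventional chess defaults assumed here.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_elo(
    rating_a: float, rating_b: float, score_a: float, k: float = 32.0
) -> tuple[float, float]:
    """Return new ratings after one vote.

    score_a is 1.0 if voters preferred A, 0.0 if they preferred B,
    and 0.5 for a tie. k (assumed value) controls how far ratings move.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated models; the voter prefers model A.
print(update_elo(1000.0, 1000.0, 1.0))  # (1016.0, 984.0)
```

An upset win against a much higher-rated model moves the ratings further than an expected win, which is how a new entrant can climb quickly after it debuts.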

How did the Google Gemma 4 models rank on the Vision Arena?

Google's Gemma-4-31b debuted as the number two open-weight model and ranked twentieth overall. The smaller Gemma-4-26b-a4b version entered as the fourth-ranked open model and twenty-sixth overall. These rankings place Gemma 4 ahead of several proprietary frontier models, including specific versions of GPT-4o and Gemini 2.0 Flash.

Is Google Gemma 4 open source?

Gemma 4 is a family of open-weight models released under the Apache 2.0 license. This licensing allows developers and researchers to download, run, and customize the models on their own hardware. Unlike proprietary models that require cloud API access, Gemma 4 provides the flexibility of local deployment for advanced reasoning and agentic workflows.

What is the pricing for Google Gemma 4?

Gemma 4 can be run locally with no API fees, and it is also available via API. The Gemma-4-31b model is priced at $0.14 per million input tokens and $0.40 per million output tokens. This pricing is significantly lower than that of many proprietary frontier models that offer comparable performance on visual reasoning tasks.
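As a rough guide to what these rates mean in practice, the following sketch estimates the API cost of a single request. The two prices come from the figures above; the token counts in the example are hypothetical, and the function name is ours.

```python
# Listed API prices for Gemma-4-31b, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 0.14
OUTPUT_PRICE_PER_MILLION = 0.40


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in dollars for one request."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: 200,000 input tokens and 5,000 output tokens.
print(f"${request_cost(200_000, 5_000):.4f}")  # $0.0300
```

At these rates, a request that sends 200,000 input tokens and receives 5,000 output tokens costs about three cents.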

What are the technical specifications of the Gemma 4 models?

The Gemma 4 models feature a context window of 262,144 tokens, allowing them to process large amounts of information in a single request. They are natively multimodal, meaning they can reason across text and visual data simultaneously. The family includes a 31-billion parameter version and a 26-billion parameter version optimized for efficiency.
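The context window figure can look inconsistent because it is sometimes rounded to 262.1K and sometimes written as 256K. The short check below shows that these describe the same number, depending on whether "K" means 1,000 or 1,024.

```python
CONTEXT_WINDOW_TOKENS = 262_144

# The window is an exact power of two.
print(CONTEXT_WINDOW_TOKENS == 2 ** 18)  # True

# Decimal thousands give the "262.1K" figure.
print(CONTEXT_WINDOW_TOKENS / 1_000)  # 262.144

# Binary thousands (1,024) give the "256K" figure.
print(CONTEXT_WINDOW_TOKENS / 1_024)  # 256.0
```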

Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →

See all AI news & updates from Arena →

Keep reading

Arena.ai Ranks Google Gemini 3.5 Flash in Top Ten for Coding

Gemini 3.5 Flash has entered the Arena.ai leaderboards with a ninth-place ranking in both the overall Text and Frontend Coding categories. The model establishes a new price-performance frontier by delivering a 70-point jump in coding capability over its predecessor.

Google · Apr 27

Google Launches Gemma 4 to Bring Frontier Reasoning to Local Devices

Google released Gemma 4, a new family of open models built on the same architecture as Gemini 3 and licensed under Apache 2.0. These models deliver high-performance reasoning and native multimodal capabilities directly on consumer hardware, enabling private, offline agentic workflows. This shift allows developers to build sophisticated AI applications that run entirely on-device without sacrificing intelligence.

Google Gemma · May 12

Google Gemma 4 Claims Top Rankings on Japanese Swallow Leaderboard v2

Google confirmed that its Gemma 4 open-weight model achieved high-ranking results on the Swallow Leaderboard v2, a rigorous Japanese language benchmark. This validation establishes the model as a leading choice for developers building regional applications that require frontier-level reasoning in Japanese.

Vercel · Apr 2

Vercel brings Google Gemma 4 to AI Gateway for high-performance agentic workflows

Vercel now supports Google's Gemma 4 models on its AI Gateway, offering native function calling and structured JSON output for building autonomous agents. These 26B and 31B models feature a 256K context window and are built on the same architecture as Gemini 3. This integration allows developers to deploy high-performance open models with enterprise-grade reliability and no price markup.
