Vercel Index Reveals Agentic Workloads Drive Majority of Production AI Traffic

Vercel

May 14, 2026 · Updated Jun 13, 2026

Vercel's AI Gateway production index shows that agentic workloads now account for nearly 60 percent of all token volume, up from 31.6 percent six months earlier. The report highlights a shift toward multi-model architectures in which high-volume teams route tasks across an average of 35 distinct models.

Vercel, the frontend cloud platform and creator of the AI SDK, released its AI Gateway production index based on production traffic from more than 200,000 teams. Agentic AI now carries 58.9 percent of all token volume. The shift builds on Vercel Workflows, which helped nearly double agentic traffic since late 2025.

Agentic token volume: 58.9%
High-volume team fleet size: 35 models (average)
Anthropic spend share: 61%
Google token volume share: 38%
Request fallback rate: 3.5%
B2B vs B2C token cost: 2x higher (average)

The report shows different labs winning specific layers of the same application. Anthropic captures 61 percent of spend by handling high-stakes reasoning, a lead that follows the launch of the Claude Opus 4.7 high-speed tier. Google, by contrast, leads in token volume with a 38 percent share. This diversification builds on the Google Gemma 4 integration and on GPT-5.5 support, which together drive multi-model adoption.

The practical takeaway is to design for a multi-model fleet rather than a single model. High-volume teams use an average of 35 models to manage the cost of being wrong, and they pay more for accuracy in B2B contexts, where token cost averages twice that of B2C. Automated fallbacks protect uptime: 3.5 percent of requests rely on them to complete.
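One way to read "the cost of being wrong" is as an expected-cost comparison: a pricier model is worth it once the cost of an error outweighs the price difference. The sketch below is illustrative only. The model names, prices, error rates, and error costs are all hypothetical and do not come from the report.

```typescript
// Hypothetical model options; none of these figures come from the report.
interface ModelOption {
  name: string;
  pricePerRequest: number; // dollars per request (assumed)
  errorRate: number;       // fraction of requests answered wrongly (assumed)
}

// Expected cost of one request: what you pay plus what errors cost you.
function expectedCost(option: ModelOption, costOfError: number): number {
  return option.pricePerRequest + option.errorRate * costOfError;
}

// Pick the option with the lowest expected cost for a given cost of error.
function cheapestInExpectation(
  options: ModelOption[],
  costOfError: number,
): ModelOption {
  let best: ModelOption | undefined;
  for (const option of options) {
    if (
      best === undefined ||
      expectedCost(option, costOfError) < expectedCost(best, costOfError)
    ) {
      best = option;
    }
  }
  if (best === undefined) {
    throw new Error("no model options supplied");
  }
  return best;
}

const options: ModelOption[] = [
  { name: "cheap-model", pricePerRequest: 0.001, errorRate: 0.1 },
  { name: "premium-model", pricePerRequest: 0.02, errorRate: 0.02 },
];

// Low-stakes request (an error costs about 5 cents): the cheap model wins.
console.log(cheapestInExpectation(options, 0.05).name); // "cheap-model"

// High-stakes request (an error costs about 5 dollars): the premium model wins.
console.log(cheapestInExpectation(options, 5).name); // "premium-model"
```

Under these assumed numbers, the same fleet serves both cases; only the cost assigned to an error changes which model is the rational choice.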

View the full update on vercel.com

Still wondering? A few quick answers below.

What is the Vercel AI Gateway production index?

The Vercel AI Gateway production index is a report based on seven months of production traffic data from over 200,000 unique teams. It analyzes how tens of trillions of tokens are routed, spent, and consumed across hundreds of models. Unlike benchmarks, it provides a real-world view of model adoption and usage patterns in live applications.

How much of production AI traffic is now agentic?

Agentic workloads, which involve AI models using tools or calling APIs to complete tasks, now account for 58.9 percent of all token volume. This is a significant increase from 31.6 percent just six months ago. These requests are roughly 2.6 times more token-heavy than standard chat interactions because they often involve multi-step chains.
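The figures above can be turned into two quick back-of-the-envelope numbers: the growth factor in agentic token share, and the share of requests that would produce that token share. The second calculation is not from the report. It assumes every non-agentic request uses one unit of tokens and every agentic request uses 2.6 units, so treat the result as a rough illustration.

```typescript
// Figures quoted in the article.
const agenticTokenShareNow = 0.589;  // 58.9% of token volume today
const agenticTokenShareThen = 0.316; // 31.6% six months earlier
const agenticTokenRatio = 2.6;       // agentic requests use ~2.6x the tokens

// Growth in agentic token share over six months.
const growthFactor = agenticTokenShareNow / agenticTokenShareThen;

// Assumption (not from the report): each non-agentic request uses 1 unit
// of tokens and each agentic request uses `ratio` units. If r is the
// agentic share of requests, then
//   tokenShare = ratio * r / (ratio * r + (1 - r))
// which rearranges to
//   r = tokenShare / (ratio - (ratio - 1) * tokenShare).
function impliedRequestShare(tokenShare: number, ratio: number): number {
  return tokenShare / (ratio - (ratio - 1) * tokenShare);
}

console.log(growthFactor.toFixed(2)); // "1.86"
console.log(
  (impliedRequestShare(agenticTokenShareNow, agenticTokenRatio) * 100).toFixed(1),
); // "35.5"
```

Under that assumption, roughly a third of requests would account for nearly 60 percent of tokens, which is consistent with agentic traffic dominating token volume without dominating request counts.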

Which AI model providers lead in spend versus volume?

Anthropic leads the market in spend with a 61 percent share, as teams use its high-reasoning models for quality-critical tasks. Google leads in token volume with a 38 percent share, driven by the adoption of Gemini Flash for low-cost, high-frequency workloads. This shows that different providers are winning different layers of the same applications.

How many models do production AI teams use at scale?

While smaller teams use an average of three models, high-volume organizations with over 10 million requests use an average of 35 distinct models. These teams treat models as swappable components in a routing graph, using different models for specific tasks like intent detection, reasoning, and summarization to optimize for cost and performance across their entire application.
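A minimal sketch of what "swappable components in a routing graph" might look like in application code is a table that maps each task to a model identifier. The task names follow the article; the model identifiers and helper functions are hypothetical placeholders, not real model names or any gateway API.

```typescript
// Tasks named in the article.
type Task = "intent-detection" | "reasoning" | "summarization";

// Routing table: task -> model identifier.
// The identifiers below are hypothetical placeholders.
const routes: Record<Task, string> = {
  "intent-detection": "small-fast-model",
  reasoning: "large-reasoning-model",
  summarization: "mid-size-model",
};

// Look up the model currently assigned to a task.
function modelFor(task: Task): string {
  return routes[task];
}

// Swap the model behind one task without touching any call sites.
function swapModel(task: Task, model: string): void {
  routes[task] = model;
}

console.log(modelFor("reasoning")); // "large-reasoning-model"

swapModel("summarization", "cheaper-summary-model");
console.log(modelFor("summarization")); // "cheaper-summary-model"
```

Keeping the mapping in one place is the design choice that makes a model swappable: callers ask for a task, never a specific model, so replacing a model is a one-line change.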

What is the role of fallbacks in production AI applications?

Approximately 3.5 percent of requests on Vercel's AI Gateway rely on automated fallbacks to complete successfully. These fallbacks trigger when an initial request hits an error, rate limit, or timeout. Failures are more common in expensive, high-reasoning calls, making fallbacks essential for maintaining uptime in complex agentic workflows and large-scale production environments.
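The fallback behavior described above (try the next model when a call errors, is rate limited, or times out) can be sketched as a generic helper. This is not Vercel's implementation or the AI Gateway API. `callWithFallbacks`, `withTimeout`, and the model names in the usage note are hypothetical, and the actual model call is passed in as a function.

```typescript
// A model call supplied by the caller; it should reject on errors or rate limits.
type ModelCall<T> = (model: string) => Promise<T>;

interface FallbackResult<T> {
  value: T;
  model: string;    // the model that finally answered
  attempts: number; // how many models were tried
}

// Reject if the promise does not settle within timeoutMs.
function withTimeout<T>(promise: Promise<T>, timeoutMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`timed out after ${timeoutMs} ms`)),
      timeoutMs,
    );
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}

// Try each model in order; any rejection or timeout moves on to the next one.
async function callWithFallbacks<T>(
  models: string[],
  call: ModelCall<T>,
  timeoutMs = 10_000,
): Promise<FallbackResult<T>> {
  const failures: string[] = [];
  let attempts = 0;
  for (const model of models) {
    attempts += 1;
    try {
      const value = await withTimeout(call(model), timeoutMs);
      return { value, model, attempts };
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      failures.push(`${model}: ${reason}`);
    }
  }
  throw new Error(`all models failed (${failures.join("; ")})`);
}
```

A caller would pass an ordered list such as `["primary-model", "backup-model"]` and a function that performs the request. Note that this sketch only stops waiting on a timed-out call; cancelling the underlying request would need extra plumbing, such as an abort signal passed into the call.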

Every HeadsUpAI update is written based on its original source and reviewed before it's published.

Keep reading

Vercel AI Gateway Recovers Over 1 Trillion Tokens Monthly with Built-in Redundancy

Vercel announced its AI Gateway recovers over 1 trillion tokens monthly by providing redundancy and failover mechanisms. The service offers zero markup over model providers, along with zero-data retention enforcement, observability, usage APIs, and caps. This positions the gateway as a resilient and cost-effective solution for deploying AI applications.

Vercel Brings AI Gateway to WordPress for Unified Multi-Model Access

Vercel · May 20

Vercel launched an AI Gateway plugin for WordPress 7.0 that connects sites to hundreds of models from 40+ providers through a single API key. This integration moves AI infrastructure management out of individual plugins and into a centralized layer with built-in fallbacks and observability.
