https://t.co/mwlAoCno2r
Vercel Index Reveals Agentic Workloads Drive Majority of Production AI Traffic
Vercel · Updated
Vercel's AI Gateway production index shows that agentic workloads now account for nearly 60 percent of all token volume, doubling in just six months. The report highlights a shift toward multi-model architectures where high-volume teams route tasks across an average of 35 distinct models.
- Agentic token volume: 58.9%
- High-volume team fleet size: 35 models (average)
- Anthropic spend share: 61%
- Google token volume share: 38%
- Request fallback rate: 3.5%
- B2B vs B2C token cost: 2x higher (average)
The report shows different labs winning different layers of the same application. Anthropic captures 61 percent of spend by handling high-stakes reasoning, a result that follows the launch of the Claude Opus 4.7 high-speed tier, while Google accounts for 38 percent of token volume. The diversification builds on the Google Gemma 4 integration and on GPT-5.5 support, both of which drive multi-model adoption.
Design for a multi-model fleet rather than a single model. High-volume teams use an average of 35 models to manage the cost of being wrong, and token costs average 2x higher in B2B contexts than in B2C, where teams pay more for accuracy. Automated fallbacks protect uptime: 3.5 percent of requests rely on these rescues.
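One way to picture the routing and fallback pattern described above is the minimal Python sketch below. It is an illustration under stated assumptions, not Vercel's AI Gateway API: the `ROUTES` table, the tier names, the model names, and the `call_model` callable are all hypothetical stand-ins for whatever client and model identifiers a team actually uses.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical routing table: each task tier maps to an ordered fallback chain.
# The model names are placeholders, not real model identifiers.
ROUTES: Dict[str, List[str]] = {
    "high_stakes": ["premium-reasoning-model", "backup-reasoning-model"],
    "bulk": ["cheap-fast-model", "backup-fast-model"],
}


class AllModelsFailed(Exception):
    """Raised when every model in a tier's fallback chain has failed."""


def call_with_fallback(
    tier: str,
    prompt: str,
    call_model: Callable[[str, str], str],
) -> Tuple[str, str, bool]:
    """Try each model for the tier in order.

    Returns (model_used, response, rescued), where rescued is True when
    the first model in the chain failed and a later one answered.
    """
    errors = []
    for position, model in enumerate(ROUTES[tier]):
        try:
            response = call_model(model, prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append((model, exc))
            continue
        return model, response, position > 0
    raise AllModelsFailed(errors)


def fake_call_model(model: str, prompt: str) -> str:
    """Stand-in for a real client call; simulates an outage on one model."""
    if model == "premium-reasoning-model":
        raise TimeoutError("simulated outage")
    return f"{model}: {prompt}"


model_used, response, rescued = call_with_fallback(
    "high_stakes", "Review this contract clause", fake_call_model
)
print(model_used, rescued)  # prints: backup-reasoning-model True
```

Returning the `rescued` flag alongside the response lets a team count rescued requests against total requests, which is how a fallback rate such as the 3.5 percent figure could be tracked in its own traffic.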
Every HeadsUpAI update is written based on its original source and reviewed before it's published.

