Kimi K2.5 is now available on #WorkersAI. You can now build and run agents end-to-end on the Cloudflare Developer Platform. Read about how we tuned our inference stack to drive down costs for internal agent workflows. https://t.co/kEQ6HHpoJS
Cloudflare Workers AI Adds Kimi K2.5 for End-to-End Agent Workflows
Cloudflare · Updated
Cloudflare's Workers AI now supports Kimi K2.5, Moonshot AI's frontier open-source model with a 256k context window. Developers can build and run full agent workflows on Cloudflare's platform, with prefix caching and a new async API cutting inference costs.
Cloudflare's internal security review agent, which processes over 7 billion tokens per day, cost 77% less to run on Kimi K2.5 than on a mid-tier proprietary model. As agentic workloads scale, open-source frontier models with this price-performance ratio become the practical path for enterprises running high-volume inference.
The model is available as @cf/moonshotai/kimi-k2.5 on Workers AI. A new x-session-affinity header improves prefix cache hit rates for multi-turn sessions, and a revamped async API handles batch inference without capacity errors.
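As a rough illustration of how a multi-turn session might use the header, the sketch below builds a request against the Workers AI REST endpoint. The model ID and header name come from the announcement. The header value format (an arbitrary per-session string), the `messages` request body, and the helper names `build_chat_request` and `send` are assumptions made for illustration, not confirmed details of Cloudflare's API.

```python
import json
import urllib.request

MODEL_ID = "@cf/moonshotai/kimi-k2.5"  # model ID as given in the announcement


def build_chat_request(account_id, api_token, session_id, messages):
    """Assemble the URL, headers, and JSON body for one chat turn.

    Hypothetical helper: the header value is assumed to be any stable
    per-session string, and the body is assumed to use a `messages` list.
    """
    url = (
        "https://api.cloudflare.com/client/v4/accounts/"
        f"{account_id}/ai/run/{MODEL_ID}"
    )
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
        # Reuse the same value on every turn of a conversation so that
        # turns can be routed where the shared prefix is already cached.
        "x-session-affinity": session_id,
    }
    body = json.dumps({"messages": messages}).encode("utf-8")
    return {"url": url, "headers": headers, "body": body}


def send(request):
    """POST a request built by build_chat_request and return parsed JSON."""
    req = urllib.request.Request(
        request["url"],
        data=request["body"],
        headers=request["headers"],
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Each turn of a conversation would call `build_chat_request` with the same `session_id` and the growing message history, so the unchanged prefix of the prompt has a chance of hitting the cache. The async batch API is not shown, since the announcement does not describe its request shape.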