Now you can use Gemma directly in the Gemini CLI! 🚀 v0.40.0 introduces experimental support for local Gemma models, starting with intelligent model routing (with full local execution on the roadmap!). https://t.co/8tuZSNgaDP
Google Gemini CLI Integrates Local Gemma Models for Intelligent Task Routing
Gemini CLI v0.40.0 introduces experimental support for running Gemma models locally to handle model routing decisions. By offloading intent analysis to the user's hardware, the agent reduces its dependence on cloud APIs and cuts latency for simple tasks. This is the first step toward full local execution, which is on the roadmap for Google's terminal-based agent.
This shift addresses the latency and cost of sending minor agentic decisions to cloud-based models. With a local router, the CLI can decide how to handle each request on the user's machine, with no API call or fee for that decision. It mirrors an industry move toward hybrid architectures that balance local privacy with cloud-scale intelligence.
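The hybrid routing idea can be illustrated with a minimal sketch. This is not Gemini CLI's implementation: the tier names, the route function, and the toy keyword_classifier are all hypothetical, and the classifier merely stands in for a locally hosted model such as Gemma. The sketch assumes the local model labels each prompt "simple" or "complex" and that anything unexpected falls back to the more capable cloud tier.

```python
from typing import Callable

# Hypothetical tier names; not the identifiers Gemini CLI uses.
LIGHT_MODEL = "cloud-light"
HEAVY_MODEL = "cloud-heavy"


def route(prompt: str, classify: Callable[[str], str]) -> str:
    """Pick a cloud model tier from a local classifier's verdict.

    `classify` stands in for a locally hosted model and is expected to
    return "simple" or "complex". Any other answer, or an exception,
    falls back to the heavier tier.
    """
    try:
        verdict = classify(prompt).strip().lower()
    except Exception:
        return HEAVY_MODEL
    return LIGHT_MODEL if verdict == "simple" else HEAVY_MODEL


def keyword_classifier(prompt: str) -> str:
    """Toy stand-in for the local model: short prompts count as simple."""
    return "simple" if len(prompt.split()) < 8 else "complex"
```

Calling route("list files here", keyword_classifier) selects the lighter tier without any network request for the decision itself. Falling back to the heavier tier on errors is one possible design choice: it trades some cost for safety when the local model is unavailable or gives an unusable answer.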
You can now use the gemini gemma command to set up local model integration. The feature is experimental and currently limited to routing decisions; full local execution for agentic tasks is on the roadmap. The update is available now in the open-source Gemini CLI, a first step toward a more private alternative to cloud-only coding assistants.
Every HeadsUpAI update is written based on its original source and reviewed before it's published.




