Now you can use Gemma directly in the Gemini CLI! 🚀 v0.40.0 introduces experimental support for local Gemma models, starting with intelligent model routing (with full local execution on the roadmap!). https://t.co/8tuZSNgaDP
Google Gemini CLI Integrates Local Gemma Models for Intelligent Task Routing
Google released Gemini CLI v0.40.0, introducing experimental support for Gemma models running on local hardware. The first supported feature is intelligent model routing: a local model analyzes the intent of a request and decides how the task should be directed. This follows the recent launch of Gemma 4 frontier reasoning.
This shift addresses the latency and cost of calling cloud-based models for minor agentic decisions. With a local router, the CLI can make routing decisions on the user's own machine, with no network round trip and no API fees for that step. It mirrors an industry move toward hybrid architectures that balance local privacy with cloud-scale intelligence.
You can now use the gemini gemma command to set up local model integration. While currently limited to routing decisions, the roadmap includes full local execution for agentic tasks. The update is available now as an open-source tool, providing a private alternative to cloud-only coding assistants.
Google Gemma
@googlegemma
30 retweets · 334 likes
Still wondering? A few quick answers below.
Gemini CLI is an open-source terminal-based AI agent developed by Google. It allows developers to interact with Gemini models directly from their command line to perform tasks like navigating codebases, running terminal commands, and executing multi-step agentic workflows. It serves as a developer-focused interface for Google's frontier AI models.
Local model routing uses a locally running Gemma model to analyze a user's intent before sending a request to the cloud. The local model decides how to route the task, which reduces latency and API costs for simple decisions. This hybrid approach keeps the control logic on the user's hardware while reserving cloud compute for complex reasoning.
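The routing idea described above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not Gemini CLI's actual implementation: the function `stub_local_classifier` is a keyword heuristic standing in for a locally running Gemma model, and the target names `fast-cloud-model` and `large-cloud-model` are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RouteDecision:
    target: str  # hypothetical name of the model that will handle the task
    reason: str  # short explanation of why this target was chosen


def stub_local_classifier(prompt: str) -> str:
    """Hypothetical stand-in for a local model call.

    A real router would ask a locally running model to label the request.
    Here a simple heuristic returns "simple" or "complex".
    """
    complex_markers = ("refactor", "architecture", "debug", "migrate")
    text = prompt.lower()
    if len(text.split()) > 40 or any(marker in text for marker in complex_markers):
        return "complex"
    return "simple"


def route(
    prompt: str,
    classify: Callable[[str], str] = stub_local_classifier,
) -> RouteDecision:
    """Decide locally where a request should go before any cloud call is made."""
    try:
        label = classify(prompt)
    except Exception:
        # If the local classifier fails, fall through to the safe default below.
        label = "unknown"

    if label == "simple":
        return RouteDecision("fast-cloud-model", "local router judged the task simple")
    if label == "complex":
        return RouteDecision("large-cloud-model", "local router judged the task complex")
    # Unrecognized labels fall back to the more capable model.
    return RouteDecision("large-cloud-model", "fallback: classifier gave no usable label")


if __name__ == "__main__":
    print(route("list files in this directory"))
    print(route("Refactor the auth module to use the new token service"))
```

The classification step runs entirely on the user's machine, so the cloud is contacted only once the target is known. Falling back to the more capable model when the classifier fails or returns an unexpected label is one reasonable design choice; a cost-sensitive setup might prefer the opposite default.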
Gemini CLI is an open-source project published by Google. Unlike some proprietary AI coding assistants, its source code is publicly available, allowing developers to inspect, modify, and extend the tool. This open nature has led to community interest in forking the project to support various local and third-party models beyond Google's ecosystem.
To set up local Gemma models in Gemini CLI v0.40.0, you can use the new gemini gemma command. This streamlined setup process is designed to integrate locally running models into the CLI's workflow. Once configured, the CLI can use these local models for experimental features like intelligent routing instead of relying entirely on cloud-based inference.
Currently, Gemini CLI v0.40.0 uses local Gemma models only for intelligent routing decisions. However, the official roadmap includes plans for full local execution. This future capability would allow the agent to complete entire tasks and execute code directly on the user's machine, without connecting to external cloud APIs for any part of the process.




