.@GoogleDeepMind's Gemma 4 - 12B is available on Ollama! Chat: ollama run gemma4:12b-mlx Hermes Agent: ollama launch hermes --model gemma4:12b-mlx Claude Code: ollama launch claude --model gemma4:12b-mlx and more ๐๐๐ (Note, this currently works via MLX) https://t.co/BWmHT9w33m
Ollama Adds Google DeepMind's Gemma 4 12B for Local Agentic AI
OllamaOllama has made Google DeepMind's Gemma 4 12B model available for local execution, including support for chat and agentic applications. This expands access to a powerful, open-weight multimodal model optimized for on-device reasoning and coding, enabling private and offline AI workflows on consumer hardware.
gemma4:12b-mlx variant was initially available via MLX, with the gemma4:12b model now accessible across all platforms.- Local Size
- 7.6GB
- Variants
- gemma4:12b, gemma4:12b-mlx
- Agentic Apps
- Hermes Agent, Claude Code, and more
- Access
- ollama run (chat), ollama launch (agents)
The Gemma 4 family of models, built by Google DeepMind, is designed for frontier-level performance in reasoning, agentic workflows, coding, and multimodal understanding. These models are optimized for efficient local execution on consumer hardware, bringing advanced capabilities like native function-calling and system prompt support directly to users' machines. This aligns with a broader trend of making powerful AI models available for on-device, private, and offline use.
The Gemma 4 12B model can be run locally via Ollama using simple commands, allowing for direct chat interactions or integration into agentic tools. This enables developers and users to build and experiment with sophisticated AI applications that leverage Google's Gemma 4 12B capabilities without relying on cloud APIs. Ollama's support for agentic coding tools continues to expand these local options.
Still wondering? A few quick answers below.
Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards โ


