Ollama Adds Google DeepMind's Gemma 4 12B for Local Agentic AI

OllamaOllama

Ollama has made Google DeepMind's Gemma 4 12B model available for local execution, including support for chat and agentic applications. This expands access to a powerful, open-weight multimodal model optimized for on-device reasoning and coding, enabling private and offline AI workflows on consumer hardware.

Ollama has added Google DeepMind's Gemma 4 12B model to its library, enabling local execution of the multimodal model. The integration supports chat and agentic tools like Hermes Agent and Claude Code. The gemma4:12b-mlx variant was initially available via MLX, with the gemma4:12b model now accessible across all platforms.
Local Size
7.6GB
Variants
gemma4:12b, gemma4:12b-mlx
Agentic Apps
Hermes Agent, Claude Code, and more
Access
ollama run (chat), ollama launch (agents)

The Gemma 4 family of models, built by Google DeepMind, is designed for frontier-level performance in reasoning, agentic workflows, coding, and multimodal understanding. These models are optimized for efficient local execution on consumer hardware, bringing advanced capabilities like native function-calling and system prompt support directly to users' machines. This aligns with a broader trend of making powerful AI models available for on-device, private, and offline use.

The Gemma 4 12B model can be run locally via Ollama using simple commands, allowing for direct chat interactions or integration into agentic tools. This enables developers and users to build and experiment with sophisticated AI applications that leverage Google's Gemma 4 12B capabilities without relying on cloud APIs. Ollama's support for agentic coding tools continues to expand these local options.

Gemma 4 12B and 26B performance benchmarks across eight key evaluation metrics compared to Gemma 3 27B.
ollama
ollama
@ollama
X

.@GoogleDeepMind's Gemma 4 - 12B is available on Ollama! Chat: ollama run gemma4:12b-mlx Hermes Agent: ollama launch hermes --model gemma4:12b-mlx Claude Code: ollama launch claude --model gemma4:12b-mlx and more ๐Ÿ‘‡๐Ÿ‘‡๐Ÿ‘‡ (Note, this currently works via MLX) https://t.co/BWmHT9w33m

138retweets1.3klikes
View on X

Still wondering? A few quick answers below.

Gemma 4 12B is a multimodal model from Google DeepMind designed for reasoning, agentic workflows, coding, and multimodal understanding. It processes text and image inputs and generates text outputs, with configurable thinking modes for enhanced reasoning.

Gemma 4 models offer advanced reasoning, extended multimodality for text and image, diverse architectures including Mixture-of-Experts, and optimizations for on-device execution. They also feature increased context windows, enhanced coding and agentic capabilities with native function-calling, and native system prompt support.

You can run Gemma 4 12B locally using Ollama. After installing Ollama, you can use the command ollama run gemma4:12b for chat interactions or ollama launch [agent_name] --model gemma4:12b to integrate it with supported agentic applications like Claude Code or Hermes Agent.

Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards โ†’

Share this update