Qwen 3.5 Vision Models Now Runnable Locally via Ollama

Qwen


The Qwen 3.5 family is now available via Ollama, with the 35B and 122B models runnable locally and the 35B fitting on a 24GB GPU. All three sizes (35B, 122B, and 397B) include built-in vision, expanded language support, and improved efficiency compared to previous Qwen releases.

Ollama has added the Qwen 3.5 family to its model library. The 35B model (MoE architecture) runs on systems with 24GB or more of GPU memory: ollama run qwen3.5:35b. The 122B variant (ollama run qwen3.5:122b) targets higher-end setups, while the 397B remains cloud-only. All three include built-in vision, replacing the separate VL model variants from Qwen 2.5.

The shift to native vision is significant for local AI workflows: you no longer need a separate vision model or a cloud API to handle images and documents alongside text. Qwen 3.5 also adds broader multilingual support and more efficient inference, making it a direct upgrade path for anyone already running Qwen locally.
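As a rough illustration of what handling images alongside text can look like in practice, here is a minimal Python sketch that sends one image and a prompt to a locally running Ollama server through its REST API. It assumes the server is listening on Ollama's default address (http://localhost:11434) and that qwen3.5:35b has already been pulled. The function names build_vision_request and describe_image are hypothetical helpers, not part of Ollama.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_vision_request(image_bytes: bytes, prompt: str,
                         model: str = "qwen3.5:35b") -> dict:
    """Build a JSON payload for /api/generate with one attached image."""
    return {
        "model": model,
        "prompt": prompt,
        # Images are passed as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        # Ask for a single JSON reply instead of a token stream.
        "stream": False,
    }


def describe_image(path: str, prompt: str = "Describe this image.") -> str:
    """Send the image at `path` to the local model and return its text reply."""
    with open(path, "rb") as f:
        payload = build_vision_request(f.read(), prompt)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With the server running, a call such as describe_image("invoice.png", "Summarize this document.") would return the model's reply as a string. The file name is only a placeholder.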

For 24GB GPU owners, ollama run qwen3.5:35b is the fastest way to try it. The 122B model is available for workstations with more VRAM.

ollama (@ollama) on X:

Qwen 3.5 family is here!
> vision built-in, and can outperform previous VL models
> designed to be more efficient
> expanded support for more languages

35B: (fits on 24GB+ system)
ollama run qwen3.5:35b

122B:
ollama run qwen3.5:122b

397B (cloud only):
ollama run https://t.co/JIobYmBgCj


