ComfyUI Integrates Google Gemma 4 and Netflix VOID for Multimodal Video Workflows

ComfyUI

May 15, 2026 · Updated Jun 13, 2026

ComfyUI has natively integrated Google's Gemma 4 multimodal model and Netflix's VOID video inpainting model into its node-based orchestration platform. The update lets users pair a reasoning-capable multimodal LLM with video object removal that also erases the shadows and reflections an object casts. With these open-source models available as nodes in a visual workflow, creators can build automated pipelines for media editing and analysis.

ComfyUI, a node-based AI orchestration platform, has natively integrated three new open-source models. The update adds Google's Gemma 4, a multimodal LLM (a model that processes text, images, audio, and video) with a reasoning mode, alongside VOID from Netflix and the BiRefNet background removal utility.

Gemma 4 inputs: Text, image, audio, and video
VOID capabilities: Video object, shadow, and reflection removal
BiRefNet capability: Background removal
Availability: Native ComfyUI nodes and cloud templates
Gemma 4 feature: Built-in step-by-step reasoning mode

This integration extends ComfyUI beyond image generation toward multimodal reasoning. VOID removes objects from video along with their shadows and reflections, while Gemma 4 lets a workflow analyze the content it is editing. This mirrors a broader industry move toward agentic workflows that combine reasoning models with specialized media tools.

You can now deploy these models through cloud templates or local nodes to automate complex editing. Gemma 4 can analyze video frames to guide generation, while VOID enables seamless object erasure. These tools are available as open-source integrations within the ComfyUI interface for local and cloud-based execution.
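As a rough illustration of how an analyze-then-erase pipeline like this could be expressed outside the visual editor, here is a minimal Python sketch of a node graph in the JSON shape that a local ComfyUI server is generally understood to accept at its `/prompt` endpoint (port 8188 by default). Every `class_type` and input name below is a hypothetical placeholder, since the announcement does not name the actual Gemma 4 or VOID nodes. The sketch only builds and validates the request body and does not contact a server.

```python
import json
import uuid


def build_workflow(video_path: str, instruction: str) -> dict:
    """Build a four-node graph: load video, analyze, remove object, save.

    All class_type values and input names are HYPOTHETICAL placeholders,
    not the real node names shipped with ComfyUI.
    """
    return {
        "1": {"class_type": "HypotheticalLoadVideo",
              "inputs": {"path": video_path}},
        # Placeholder for a Gemma 4 node that inspects frames and
        # describes which object should be removed.
        "2": {"class_type": "HypotheticalGemma4Analyze",
              "inputs": {"frames": ["1", 0],
                         "prompt": instruction,
                         "reasoning": True}},
        # Placeholder for a VOID node that erases the target object.
        "3": {"class_type": "HypotheticalVoidRemove",
              "inputs": {"frames": ["1", 0], "target": ["2", 0]}},
        "4": {"class_type": "HypotheticalSaveVideo",
              "inputs": {"frames": ["3", 0],
                         "filename_prefix": "void_out"}},
    }


def upstream_ids(node: dict) -> list[str]:
    """Return ids of nodes this node reads from.

    A link is written as [source_node_id, output_index].
    """
    return [value[0] for value in node["inputs"].values()
            if isinstance(value, list) and len(value) == 2
            and isinstance(value[0], str)]


def validate_links(workflow: dict) -> None:
    """Raise ValueError if any link points at a node that does not exist."""
    for node_id, node in workflow.items():
        for source in upstream_ids(node):
            if source not in workflow:
                raise ValueError(
                    f"node {node_id} links to missing node {source}")


def to_request_body(workflow: dict, client_id=None) -> bytes:
    """Serialize the graph into a JSON request body after checking links."""
    validate_links(workflow)
    payload = {"prompt": workflow,
               "client_id": client_id or uuid.uuid4().hex}
    return json.dumps(payload).encode("utf-8")
```

In practice the real graph would come from exporting a working workflow in API format from the ComfyUI interface, which supplies the actual node names and inputs; the link check above is just a cheap guard against dangling references before a job is queued.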

View the full update on blog.comfy.org

ComfyUI (@ComfyUI) · May 15

Three new open-source models just landed in ComfyUI natively: → Gemma 4 (Google DeepMind) - multimodal LLM handling text, image, audio, and video input with built-in step-by-step reasoning mode → VOID (Netflix) - video object removal that also erases shadows, reflections, and https://t.co/K1cTS7ECCg

Still wondering? A few quick answers below.

What is Google Gemma 4 in ComfyUI?

Google Gemma 4 is a multimodal large language model developed by Google that is now natively supported in ComfyUI. It accepts text, image, audio, and video input. A key feature is its built-in step-by-step reasoning mode, which lets the model work through multi-step analysis tasks inside a visual workflow.

What does the Netflix VOID model do?

VOID is a video inpainting model developed by Netflix that is now integrated into ComfyUI. Unlike standard object removal tools, VOID is designed to erase not just the target object but also the shadows and reflections it casts. The result is cleaner video edits that stay visually consistent across frames.

Are the new Gemma 4 and VOID models open source?

Yes, the new models integrated into ComfyUI are open source. This includes Google's Gemma 4, Netflix's VOID video inpainting tool, and the BiRefNet background removal utility. Because they are open source, developers and creators can run them locally within their own ComfyUI environments or use official cloud templates for immediate testing and deployment.

What is BiRefNet in the latest ComfyUI update?

BiRefNet is a utility model integrated into ComfyUI for removing backgrounds from images. It is designed to provide high-quality subject isolation, which is a common step in generative AI pipelines. Users can access this capability natively through ComfyUI nodes or by using the pre-configured cloud templates provided in the official announcement.

Every HeadsUpAI update is written based on its original source and reviewed before it's published.

Keep reading

ComfyUI Adds OpenRouter for Unified Access to Frontier Creative Models

ComfyUI launched an official OpenRouter LLM partner node, enabling direct access to over 20 frontier and open-weight models within its visual orchestration platform. The integration dynamically reconfigures its interface based on model capabilities, allowing creators to swap between vision, reasoning, and web-grounded models without rebuilding workflows.

Google launches Gemma 4 12B with native audio for laptops

Google · Jun 4

Google released Gemma 4 12B, a unified multimodal model that processes audio and vision directly within the LLM backbone. It brings near-frontier reasoning to consumer hardware, enabling complex agentic workflows to run entirely offline on standard laptops.

Google Flow Adds Agentic Editing and Character Consistency via Gemini Omni

Google DeepMind · May 20

Google updated its Flow creative studios with Gemini Omni Flash to enable precise video editing and stable character identities across scenes. By introducing an autonomous agent for batch editing and natural language tool creation, Google is shifting AI video from single-clip generation to a managed production workflow.

Ollama Adds Google DeepMind's Gemma 4 12B for Local Agentic AI

Ollama · Jun 7

Ollama has made Google DeepMind's Gemma 4 12B model available for local execution, including support for chat and agentic applications. This expands access to a powerful, open-weight multimodal model optimized for on-device reasoning and coding, enabling private and offline AI workflows on consumer hardware.
