NVIDIA Cosmos 3 takes top open weights rank with agentic reasoning

Artificial Analysis

Jun 1, 2026 · Updated Jun 12, 2026

NVIDIA's Cosmos 3 Super models have reached #1 on the Artificial Analysis open-weights leaderboards for both image and video generation. The system uses a reasoning-based architecture to refine prompts before generating high-fidelity visual content.

Artificial Analysis has ranked Cosmos 3 as the top open-weights model for both text-to-image and image-to-video tasks. The 64B Cosmos 3 Super uses a Mixture-of-Transformers architecture that pairs a 32B autoregressive reasoner with a 32B diffusion generator (a model that creates data by reversing a noise process), unifying language, image, video, audio, and action in a single model.

Model: Cosmos 3 Super
Image2Video Elo: 1,255
Parameters: 64B
License: OpenMDW 1.1
Variants: Nano (16B) and Super (64B), plus Text2Image and Image2Video fine-tunes of Super
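
To put the Elo figure above in context, here is a minimal sketch of how a rating gap maps to an expected head-to-head win rate. It assumes the conventional Elo formula (base 10, scale 400); the article does not say how Artificial Analysis computes its arena ratings, and the rival rating below is a made-up value used only for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Conventional Elo expected score of A against B (base 10, scale 400)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    cosmos_i2v_elo = 1255        # Image2Video Elo reported in the summary above
    hypothetical_rival = 1200    # illustrative assumption, not a leaderboard value
    share = expected_score(cosmos_i2v_elo, hypothetical_rival)
    print(f"Expected preference share vs. rival: {share:.3f}")
```

Under this formula, equal ratings give an expected score of 0.5, and a 400-point lead gives roughly 0.91.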

This unseats HiDream-O1-Image-Dev-2604 in the open-weights text-to-image category. While proprietary systems such as Grok-Imagine-Video-1.5 maintain the overall lead, NVIDIA's release under the OpenMDW 1.1 license provides a high-performance alternative for local use. The result follows NVIDIA's Nemotron 3 Ultra, which leads open-weights intelligence benchmarks.

Weights and code are available on Hugging Face. Because the Cosmos 3 generators take structured JSON prompts rather than plain text, the system relies on agentic prompt upsampling (using AI to expand simple instructions into detailed structured prompts), which can run in an external harness or in the model's own reasoner. First-party and third-party APIs are expected to launch in the coming weeks.
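
The idea of prompt upsampling can be sketched in a few lines of Python. This is an illustration under stated assumptions, not NVIDIA's harness: the field names in `REQUIRED_FIELDS` are hypothetical (the article does not give the real Cosmos 3 prompt schema), and `stub_reasoner` is a placeholder for whatever language model or reasoner branch would actually fill in the details.

```python
import json
from typing import Callable, Dict

# Hypothetical field names; the real Cosmos 3 JSON schema is not given in this article.
REQUIRED_FIELDS = ("subject", "scene", "camera", "style")


def stub_reasoner(plain_prompt: str) -> Dict[str, str]:
    """Placeholder for an LLM or reasoner call; fills unknown fields with defaults."""
    return {
        "subject": plain_prompt.strip(),
        "scene": "unspecified",
        "camera": "unspecified",
        "style": "unspecified",
    }


def upsample_prompt(
    plain_prompt: str,
    reasoner: Callable[[str], Dict[str, str]] = stub_reasoner,
) -> str:
    """Expand a plain-text prompt into a structured JSON string."""
    if not plain_prompt.strip():
        raise ValueError("prompt must not be empty")
    fields = reasoner(plain_prompt)
    missing = [key for key in REQUIRED_FIELDS if key not in fields]
    if missing:
        raise ValueError(f"reasoner output is missing fields: {missing}")
    # Emit only the expected fields, in a fixed order.
    return json.dumps({key: fields[key] for key in REQUIRED_FIELDS}, indent=2)


if __name__ == "__main__":
    print(upsample_prompt("a robot arm stacking red blocks"))
```

Passing the reasoner in as a callable mirrors the point that the expansion step could be handled either by an external harness or by the model itself; an agentic harness would likely iterate on the output, which this single-pass sketch does not attempt.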

View the full update on artificialanalysis.ai

Artificial Analysis

@ArtificialAnlys · Jun 1

NVIDIA's Cosmos 3 lands at #1 among open weights models in both Text to Image and Image to Video on the Artificial Analysis Leaderboards!

Cosmos 3 is a family of omnimodal world models for Physical AI from @nvidia, unifying language, image, video, audio and action in a single Mixture-of-Transformers architecture that pairs an autoregressive reasoner with a diffusion generator.

The family comes in four variants: base Nano (16B: 8B reasoner tower + 8B generator tower) and Super (64B: 32B reasoner tower + 32B generator tower) models, with the Super model also having Text2Image and Image2Video fine-tuned variants, which are the versions listed in the Artificial Analysis Arena Leaderboards.

Cosmos3-Super-Text2Image (agentic) runs through an agentic prompt-upsampling harness, and takes the #1 open weights spot in Text to Image, surpassing HiDream-O1-Image-Dev-2604, Alibaba's Qwen Image Max 2512 and Black Forest Labs' FLUX.2 [dev]. Cosmos3-Super-Image2Video takes #1 open weights in Image to Video (No Audio), ahead of Lightricks' LTX-2, and Alibaba's Wan 2.2 A14B.

Cosmos 3 generators take structured JSON prompts rather than plain text, so prompt upsampling is needed to reproduce these results. This upsampling can be handled by an external harness or by the model's own reasoner branch, so it can also run self-contained.

Cosmos 3 is fully open under the OpenMDW 1.1 license, shipping with weights, code, curated datasets and fine-tuning recipes available on @huggingface. First-party and third-party APIs are expected over the next few weeks, with pricing to follow.

See the thread below for example generations and a link to try Cosmos 3 in our arena 🧵


