NVIDIA Nemotron 3 Ultra is on Fireworks, day zero. Nemotron Ultra is an open model for frontier reasoning and orchestration in long-running autonomous agents. Think use cases like coding agents, deep research, and complex enterprise workflows. Read on: https://t.co/c8mdZwQp49 https://t.co/hQ4PJZ6mvM
Fireworks AI Adds NVIDIA Nemotron 3 Ultra for Agentic Reasoning
Fireworks AI · Updated
Fireworks AI now offers NVIDIA Nemotron 3 Ultra, an open model for advanced autonomous agents, with deployment available from launch day. This gives developers optimized infrastructure for long-running agentic tasks that require frontier reasoning and orchestration.
- Active Parameters: 55B
- Agent Productivity (PinchBench): 91%
- Long-horizon Planning (EnterpriseOps-Gym): 33%
- Coding (Terminal-Bench 2.0): 54%
- Long Context (Ruler @1M): 95%
The model is optimized for complex, multi-step tasks like coding agents, deep research, and enterprise workflows, where what matters is the cost of completing an entire task, not just a single response. NVIDIA Nemotron 3 was introduced as part of a family of models for agentic AI.
NVIDIA reports Nemotron 3 Ultra achieves 5x faster inference (running a trained AI model to generate outputs) and up to 30% lower cost for agentic tasks compared to other open models in its class. Developers can deploy it on Fireworks AI using on-demand dedicated GPUs, billed by GPU-second.
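To make per-GPU-second billing concrete, here is a minimal sketch of the arithmetic for one agentic task on a dedicated deployment. The function name, GPU count, task duration, and hourly rate are all hypothetical placeholders chosen for illustration; they are not Fireworks AI pricing or an official API.

```python
def dedicated_gpu_cost(gpu_count: int, seconds: float, usd_per_gpu_hour: float) -> float:
    """Estimate the cost of a deployment that is billed per GPU-second.

    All inputs are illustrative assumptions, not published prices.
    """
    usd_per_gpu_second = usd_per_gpu_hour / 3600  # convert hourly rate to per-second
    return gpu_count * seconds * usd_per_gpu_second


# Hypothetical example: 8 GPUs busy for a 90-second agent task
# at a placeholder rate of $7.20 per GPU-hour.
cost = dedicated_gpu_cost(gpu_count=8, seconds=90, usd_per_gpu_hour=7.20)
print(f"Estimated task cost: ${cost:.2f}")
```

Under this billing model, the cost of a whole task scales with how long the GPUs are occupied, so faster inference would translate directly into a lower bill.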





