HeadsUpAI

NVIDIA simplifies local AI agent setup on DGX Spark with NemoClaw

NVIDIA's June 2026 release introduces a streamlined installation path for NemoClaw on DGX Spark workstations. A single nemoclaw.sh command now sets up the OpenShell secure runtime, the OpenClaw agent harness, and local inference backends. The update targets growing demand for autonomous agents that run without external cloud dependencies.

Installer: nemoclaw.sh
Performance: 2.6x throughput for Qwen3.6-35B
Max cluster memory: 512GB across 4 nodes
Networking: 200 Gbps RoCE via ConnectX-7
Default model: Qwen3.6-35B

Performance optimizations for Qwen3.6-35B using NVFP4 quantization deliver a 2.6x throughput increase on DGX Spark. This allows agents to manage large context windows (the amount of data a model can process at once) more efficiently on local hardware. By shifting workloads to on-premise compute, teams can eliminate per-token API costs and keep sensitive data within their own infrastructure.
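
As a rough illustration of why 4-bit quantization matters on local hardware, the sketch below estimates weight storage for a 35-billion-parameter model at several precisions. It rests on assumptions that are not in the release notes: the parameter count is read off the "35B" in the model name, the 4.5 bits per weight for NVFP4 assumes a 4-bit value plus one 8-bit scale shared by each 16-weight block, and the function name is hypothetical. Activations and the KV cache are ignored, so real memory use will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS = 35e9  # assumption: taken from the "35B" in the model name

# Assumed effective bits per weight for each storage format.
FORMATS = {
    "BF16": 16.0,
    "FP8": 8.0,
    "NVFP4 (4-bit values + block scales)": 4.0 + 8.0 / 16.0,  # 4.5 bits
}

for name, bits in FORMATS.items():
    print(f"{name}: about {weight_memory_gb(PARAMS, bits):.1f} GB of weights")
```

Under these assumptions the 4-bit format needs roughly a quarter of the BF16 weight storage, which leaves more memory for long contexts on the same machine.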

For scaling beyond a single device, the NVIDIA Sync cluster assistant now automates multi-node networking via ConnectX-7. It lets users link up to four DGX Spark units into a high-bandwidth cluster with 512GB of unified memory. That configuration supports models and multi-agent pipelines whose memory needs exceed what a single unit can provide locally.
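
The cluster figures reduce to simple arithmetic, sketched here. The per-node value is derived by dividing the stated 512GB evenly across four nodes, which assumes identical units; the helper names are hypothetical, and memory reserved by the OS, runtime, or networking stack is not accounted for.

```python
TOTAL_GB_FOUR_NODES = 512  # stated in the article for a four-node cluster
MAX_NODES = 4              # the article describes linking up to four units
PER_NODE_GB = TOTAL_GB_FOUR_NODES // MAX_NODES  # assumes identical nodes


def cluster_memory_gb(nodes: int) -> int:
    """Aggregate memory for a cluster of the given size."""
    if not 1 <= nodes <= MAX_NODES:
        raise ValueError(f"node count must be between 1 and {MAX_NODES}")
    return nodes * PER_NODE_GB


def min_nodes_for(required_gb: float) -> int:
    """Smallest cluster size whose aggregate memory covers required_gb."""
    for nodes in range(1, MAX_NODES + 1):
        if cluster_memory_gb(nodes) >= required_gb:
            return nodes
    raise ValueError("requirement exceeds a four-node cluster")


for n in range(1, MAX_NODES + 1):
    print(f"{n} node(s): {cluster_memory_gb(n)} GB")
```

For example, a workload that needs about 200GB in total would call for at least two units under these assumptions.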

NVIDIA AI (@NVIDIAAI) on X:

From unboxing to AI agent in minutes. Getting an agent running used to mean sourcing a model, configuring an inference backend, installing a runtime, and wiring it all together. The new NemoClaw install path on DGX Spark replaces that with a single command. DGX Spark also simplifies the path to local, long-running AI agents, cutting out external cloud dependencies and providing predictable on-premise compute.


Still wondering? A few quick answers below.

NemoClaw is an open-source software stack that simplifies the deployment of local AI agents. It integrates optimized open-weight models, agent harnesses like OpenClaw, and the OpenShell secure runtime. This allows developers to set up sandboxed, always-on assistants on their own hardware without complex manual configuration or cloud dependencies.

The June 2026 update provides a 2.6x throughput improvement for the Qwen3.6-35B model on DGX Spark. This is achieved through NVFP4 quantization and optimizations in the vLLM inference engine. These enhancements allow the workstation to handle the high compute demands of autonomous agents, such as maintaining large context windows.

Yes, the new cluster assistant in NVIDIA Sync automates the process of connecting two to four DGX Spark units. A four-node cluster provides 512GB of unified memory, which is sufficient for running massive models or complex multi-agent pipelines that exceed the capacity of a single workstation.

Every HeadsUpAI update is written based on its original source and reviewed before it's published.
