HeadsUpAI

OpenClaw and NVIDIA release security dataset for 67,000 agent skills

OpenClaw and NVIDIA released a security dataset covering 67,453 agent skills on ClawHub. The release introduces NVIDIA SkillSpector, a scanner for agentic risk such as hidden instructions or overbroad capabilities. Every skill now includes a Skill Card that documents verified provenance and scan results instead of relying on publisher descriptions.

Dataset size: 67,453 skills
Malicious rate: 0.31%
Agentic risk rate: 48.71%
Max scanner agreement: 8.5%
Verification model: GPT-5.5

The data shows that scanners rarely agree: no two scanners agreed on more than 8.5% of risks. Malware is rare (0.31% of skills were malicious), but nearly half of the skills (48.71%) were flagged for agentic risk. The release extends the NVIDIA NemoClaw initiative, moving beyond code analysis toward semantic verification to catch risks that standard virus scanners miss.
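
To make the agreement figure concrete, here is a minimal Python sketch of how pairwise scanner agreement could be measured. The source does not say how the 8.5% was computed, so the Jaccard overlap used here is an assumption, and the function name, scanner keys, and skill ids are hypothetical toy values rather than data from the release.

```python
from itertools import combinations


def pairwise_agreement(flags):
    """Return the Jaccard overlap for every pair of scanners.

    `flags` maps a scanner name to the set of skill ids it flagged.
    Jaccard overlap is an assumed metric, not one confirmed by the source.
    """
    result = {}
    for a, b in combinations(sorted(flags), 2):
        union = flags[a] | flags[b]
        shared = flags[a] & flags[b]
        # Shared flags divided by all flags raised by either scanner.
        result[(a, b)] = len(shared) / len(union) if union else 0.0
    return result


# Toy data for illustration only; these are not results from the dataset.
toy_flags = {
    "virustotal": {"skill-1", "skill-2"},
    "static_analysis": {"skill-2", "skill-3", "skill-4"},
    "skillspector": {"skill-4", "skill-5", "skill-6", "skill-7"},
}

agreement = pairwise_agreement(toy_flags)
best_pair = max(agreement, key=agreement.get)
print(best_pair, round(agreement[best_pair], 3))
```

The maximum over all pairs plays the role of a "max scanner agreement" number: if even the best-matching pair overlaps on a small fraction of flags, the scanners are largely seeing different risks.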

Access the dataset on Hugging Face to benchmark security tools or audit agent deployments. This release fulfills the OpenClaw security roadmap for standardizing plugin provenance. The ClawScan pipeline, using GPT-5.5 to weigh signals, is now the default verification gate for all new skills published to the registry.

OpenClaw🦞 (@openclaw) on X:

In collaboration with @nvidia, we’re open-sourcing a dataset of security scans for 67,453 ClawHub skills on @huggingface:
- NVIDIA SkillSpector flagged 1/2 for agentic risk
- Only 0.31% were malicious
- No two scanners agreed on more than 8.5% of risks
https://t.co/ml624ExiLG


Still wondering? A few quick answers below.

NVIDIA SkillSpector is a security scanner that uses AI-assisted semantic analysis to identify agentic risks in AI agent skills. Unlike traditional malware scanners, it detects hidden instructions, risky code paths, and mismatches between a skill's declared purpose and its actual behavior.

A Skill Card is a verified trust artifact that accompanies every skill on the ClawHub registry. It documents the skill's publisher, capabilities, and security scan results. These cards are generated by the ClawScan pipeline to ensure users have verified information before installing a skill.

ClawScan is a verification pipeline built around an LLM-as-judge. It takes inputs from three independent scanners (VirusTotal, static analysis, and NVIDIA SkillSpector) and uses GPT-5.5 to weigh the conflicting signals. It then produces a final verdict of Clean, Suspicious, or Malicious for each skill.
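
The shape of that flow (three scanner signals in, one of three verdicts out) can be sketched in Python. This is an illustration only: the real pipeline uses GPT-5.5 as the judge, whereas the sketch swaps in a simple rule-based stand-in so it can run offline. Every name here (`ScanSignals`, `placeholder_judge`, `final_verdict`) and the decision rules themselves are hypothetical assumptions, not ClawScan's actual API or logic.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Verdict(Enum):
    CLEAN = "Clean"
    SUSPICIOUS = "Suspicious"
    MALICIOUS = "Malicious"


@dataclass(frozen=True)
class ScanSignals:
    """One flag per scanner named in the article (True means flagged)."""
    virustotal: bool
    static_analysis: bool
    skillspector: bool


def placeholder_judge(signals: ScanSignals) -> Verdict:
    """Rule-based stand-in for the LLM judge; the rules are invented."""
    hits = sum([signals.virustotal, signals.static_analysis, signals.skillspector])
    # Assumed rule: a malware hit corroborated by another scanner is malicious.
    if signals.virustotal and hits >= 2:
        return Verdict.MALICIOUS
    # Assumed rule: any single flag is enough to mark a skill suspicious.
    if hits >= 1:
        return Verdict.SUSPICIOUS
    return Verdict.CLEAN


def final_verdict(
    signals: ScanSignals,
    judge: Callable[[ScanSignals], Verdict] = placeholder_judge,
) -> Verdict:
    """Pass the scanner signals to a judge and return its verdict."""
    return judge(signals)


print(final_verdict(ScanSignals(False, False, True)).value)
```

Keeping the judge as a swappable callable mirrors the idea that the scanners only supply evidence, while a separate component resolves disagreements between them.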

Every HeadsUpAI update is written based on its original source and reviewed before it's published.