We’ve identified industrial-scale distillation attacks on our models by DeepSeek, Moonshot AI, and MiniMax. These labs created over 24,000 fraudulent accounts and generated over 16 million exchanges with Claude, extracting its capabilities to train and improve their own models.
Anthropic Exposes Industrial-Scale Distillation Attacks by DeepSeek, Moonshot, and MiniMax
Anthropic · Updated
Anthropic says it identified industrial-scale distillation attacks by DeepSeek, Moonshot AI, and MiniMax, which used over 24,000 fraudulent accounts to extract Claude's capabilities across more than 16 million exchanges. The attacks targeted agentic reasoning, tool use, and coding.
The campaigns used proxy services running "hydra cluster" networks: sprawling account farms that distribute traffic across API and cloud platforms and replace banned accounts instantly. Anthropic argues that distillation strips out safety guardrails, allowing unprotected capabilities to flow into military, intelligence, and surveillance systems. The disclosure reframes Chinese AI progress, suggesting that apparent breakthroughs depend partly on capabilities extracted from American models.
Anthropic is responding with behavioral fingerprinting classifiers, intelligence sharing with other labs, stronger account verification, and model-level countermeasures to reduce distillation efficacy without degrading legitimate usage.



