Anthropic Accuses Chinese AI Firms of Large-Scale Model Distillation

Anthropic, a leading artificial intelligence startup, has accused three Chinese AI companies – DeepSeek, Moonshot, and MiniMax – of running “industrial-scale campaigns” to illicitly extract capabilities from its Claude chatbot. The alleged method is “distillation,” in which a smaller model is trained on the outputs of a more powerful one. The scale of the operation, more than 16 million exchanges generated through approximately 24,000 fraudulent accounts, has raised concerns about intellectual property theft and national security.

Distillation itself isn’t a novel or inherently malicious practice. AI labs routinely use it to create smaller, more efficient versions of their own models for specific applications or to reduce computational costs. However, Anthropic argues that the coordinated, large-scale nature of these campaigns, coupled with the actors involved, crosses a line. The company’s statement warns that illicitly distilled models may lack the safeguards built into the original, potentially enabling the development of AI systems for dangerous purposes, such as bioweapons or malicious cyber activities.
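To make the idea concrete, here is a toy sketch of distillation in Python. It is purely illustrative and not drawn from Anthropic's statement: the `teacher` function, its coefficients, and the training settings are all hypothetical, and a real campaign would fine-tune a language model on text responses rather than fit two numbers. The point it shows is that the student learns only from the teacher's outputs and never sees original training labels.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def teacher(x: float) -> float:
    """Hypothetical 'large' model: returns a probability for input x."""
    return sigmoid(3.0 * x - 1.0)


def train_student(samples, epochs=2000, lr=0.5):
    """Fit a logistic student (w, b) to the teacher's outputs.

    Uses full-batch gradient descent on cross-entropy against the
    teacher's soft predictions, which serve as the only training signal.
    """
    w, b = 0.0, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, target in samples:
            error = sigmoid(w * x + b) - target
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


random.seed(0)
queries = [random.uniform(-2.0, 2.0) for _ in range(200)]
# The "exchanges": each query is paired with whatever the teacher answered.
dataset = [(x, teacher(x)) for x in queries]
w, b = train_student(dataset)
print(f"student parameters: w={w:.2f}, b={b:.2f}")
```

In this toy the student can match the teacher exactly because both have the same form. In practice the student is smaller and only approximates the teacher on the behaviors covered by the collected exchanges, which is consistent with why a narrow, high-volume set of prompts is useful to whoever is doing the distilling.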

The core of the issue lies in how these Chinese companies allegedly bypassed access controls and terms of service. According to Anthropic, the labs created what the company calls “hydra clusters” – networks of numerous fraudulent accounts operating through proxy setups to distribute traffic across Anthropic’s API and third-party cloud services. A single proxy network reportedly managed more than 20,000 fraudulent accounts at once, blending extraction requests with legitimate usage to avoid detection. Anthropic says the request patterns (high volumes, a narrow focus on specific capabilities, and repetitive prompts) were indicative of model training rather than typical user interaction.
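The three traffic signals described above (volume, narrow focus, repetition) can be sketched as a simple heuristic. This is an assumption-laden illustration, not Anthropic's detection system: the function names, the digit-collapsing normalization, and both thresholds are hypothetical choices made for the example.

```python
import re
from collections import Counter


def normalize(prompt: str) -> str:
    """Collapse digits and whitespace so templated prompts share one key."""
    lowered = re.sub(r"\d+", "<n>", prompt.lower())
    return re.sub(r"\s+", " ", lowered).strip()


def traffic_signals(prompts):
    """Summarize one account's prompts as volume and repetition measures."""
    total = len(prompts)
    if total == 0:
        return {"volume": 0, "distinct_ratio": 0.0, "top_template_share": 0.0}
    templates = Counter(normalize(p) for p in prompts)
    return {
        "volume": total,
        # Low values mean many prompts reduce to few templates.
        "distinct_ratio": len(templates) / total,
        # Share of traffic taken by the single most common template.
        "top_template_share": templates.most_common(1)[0][1] / total,
    }


def looks_like_extraction(prompts, min_volume=1000, max_distinct_ratio=0.1):
    """Flag high-volume, highly repetitive traffic (hypothetical thresholds)."""
    signals = traffic_signals(prompts)
    return (
        signals["volume"] >= min_volume
        and signals["distinct_ratio"] <= max_distinct_ratio
    )


templated = [f"Grade answer {i} against the rubric" for i in range(2000)]
organic = ["What is the capital of France?", "Help me write an email"]
print(looks_like_extraction(templated), looks_like_extraction(organic))
```

A per-account check like this would be easy to evade by spreading requests across many accounts, which is exactly what the alleged proxy networks did. Any realistic defense would therefore need to aggregate signals across accounts rather than score each one in isolation.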

Each of the three companies targeted different aspects of Claude’s capabilities. DeepSeek focused heavily on reasoning tasks, including rubric-based grading suitable for reinforcement learning and attempts to generate censorship-resistant responses to sensitive queries. Anthropic specifically noted DeepSeek’s efforts to elicit step-by-step reasoning, suggesting an attempt to collect training data that reproduces Claude’s “chain-of-thought” capabilities. Moonshot, the developer of the Kimi models, concentrated on agentic reasoning, tool use, coding, data analysis, and computer vision. MiniMax, meanwhile, conducted the largest campaign, targeting agentic coding and orchestration, and was observed adapting its efforts in real time after Anthropic released a new version of Claude.

This isn’t the first time concerns have been raised about similar practices. OpenAI has also accused DeepSeek of distilling OpenAI’s models to train its own, though without the level of detail Anthropic has provided. The situation highlights a broader tension within the AI industry, where companies are simultaneously reliant on vast datasets for training and concerned about the potential for intellectual property theft.

Anthropic is responding to these attacks on multiple fronts. The company is strengthening its defenses to make large-scale distillation more difficult to execute and easier to detect, deploying classifiers and behavioral fingerprinting systems to identify suspicious patterns in API traffic. It is also sharing technical indicators with other AI labs, cloud providers, and authorities, and tightening verification processes for accounts that might be used for fraudulent access. In addition, Anthropic is developing product-level safeguards to reduce the utility of its outputs for illicit training without impacting legitimate users.

However, Anthropic acknowledges that addressing this threat requires a coordinated effort. “Countering attacks at this scale requires coordinated industry and policy action,” the company stated. This suggests a need for broader collaboration among AI developers, governments, and potentially international organizations to establish clear guidelines and enforcement mechanisms regarding the use of distillation techniques and the protection of intellectual property in the rapidly evolving AI landscape. The incident underscores the growing importance of securing AI models not only against malicious use but also against the unauthorized extraction of their underlying capabilities.
