Kimi Called Itself Claude. Good Luck Doing Anything About That.
The day before a Reddit user caught Kimi K2.5 saying it was Claude, Anthropic published a blog post accusing three Chinese AI labs of stealing its technology.
Timing like that is hard to script.
On February 23, 2026, Anthropic dropped "Detecting and Preventing Distillation Attacks" — calling out DeepSeek, Moonshot AI, and MiniMax by name. Moonshot AI is the company behind Kimi. The accusation: these labs built automated pipelines to query Claude at scale, harvesting its outputs to train their own models. The technical term is distillation. The plain-English term is cloning.
Then the next day, someone on Reddit was casually testing Kimi K2.5 on HuggingFace. About twenty exchanges into a general conversation, they asked the model who it was. It said it was Claude. They opened a fresh session, let the conversation run long, asked again. Same answer. Claude.
That's not some weird quirk. That's a tell.
The technical explanation comes from Anthropic's own research: models trained on Claude-generated data don't just absorb Claude's reasoning style. They absorb its identity — self-reporting patterns that surface specifically in longer conversations. Short context, the model holds its cover. Long enough, the mask slips.
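The mechanics of that kind of probe are simple enough to sketch. Here's a minimal, hypothetical version of the check the Reddit user ran by hand: collect the model's self-identification answer at increasing conversation depths, then flag the depths where its stated identity drifts from its advertised name. (The function name and data shape are my own illustration, not anything from Anthropic's post.)

```python
# Hypothetical identity-drift check. Each entry pairs a conversation
# depth (number of exchanges so far) with the model's answer to
# "who are you?" asked at that point.
def identity_drift(advertised: str, reports: list[tuple[int, str]]) -> list[int]:
    """Return the conversation depths at which the model named
    something other than its advertised identity."""
    return [depth for depth, answer in reports
            if advertised.lower() not in answer.lower()]

# The pattern from the article: cover holds early, slips late.
reports = [(2, "I'm Kimi"), (10, "I'm Kimi, by Moonshot AI"), (20, "I am Claude")]
print(identity_drift("Kimi", reports))  # [20]
```

Trivial as the check is, it captures why the long-context detail matters: a single short-session answer proves nothing, but a consistent flip past some depth is the signature Anthropic's research describes.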
The Reddit thread got 79 comments. Most fell into two camps: "who cares" and "everybody does this." One person pointed out that Claude itself sometimes identifies as DeepSeek when asked in Chinese. Another argued that identity slippage is just pretraining contamination — the model saw "I am Claude" enough times in scraped text that it stuck, nothing deeper than that.
Both fair points. But Anthropic named Moonshot specifically in a formal investigation the day before this happened. Hard to chalk it up to noise.
The Enforcement Problem
Here's where the story stops being about AI and starts being about geopolitics.
Anthropic says they've built behavioral fingerprinting classifiers, they're detecting distillation patterns in API traffic, and they're sharing technical indicators with "relevant authorities." That's the whole plan — prove it, report it, hope something happens.
Moonshot AI is based in Beijing. DeepSeek is in Hangzhou. Even if Anthropic could prove beyond any doubt that these models were built on stolen Claude outputs, what's the actual next move? Sue them in what court? Lobby the US government to sanction a Chinese AI lab over training data? Issue a cease-and-desist to a company that operates outside their legal system entirely?
The fingerprinting is real. The enforcement is theater.
The Part I Actually Care About
I've been building AI workflows that route different tasks to different models — trying to find cheaper alternatives where I can. I've tested basically every Chinese model at this point, trying to make them work in a real production setup.
They fall apart the moment a task gets ambiguous. Not hard tasks. Tasks that require judgment. Following incomplete instructions. Holding context across a long conversation. Anything where the inputs aren't a clean, tidy recipe. They score fine on benchmarks and then fail on exactly the things that actually matter in day-to-day work.
Which tells you something about what distillation copies and what it doesn't.
I'm not making a moral case here. Anthropic trained their models on the whole internet without asking anyone's permission. Everybody in this industry lives in a glass house. What does frustrate me is that the copies keep getting marketed as equivalents and they aren't.
If a Chinese open-source model ever actually matched Opus 4.6 in the real world — real tasks, real ambiguity, real performance — I'd switch in a week. I'd save a real amount of money. I genuinely want that day to come.
But it hasn't. And the reason it hasn't is exactly what this story reveals. You can copy the outputs. You can't copy the process that made them. The judgment, the nuance, the ability to handle the edge case — that's not sitting in some training corpus waiting to be harvested. It's the result of years of work that didn't make it into any API response.
Kimi K2.5 called itself Claude because it was trained on Claude's words.
It just can't do what Claude does. Not yet.