DeepSeek: Is It A Stolen ChatGPT?
While I was drowning in emails and fiddling around with Xcode and the Neural Cores in my MacBook, DeepSeek popped up on X and Reddit. It claims to be a Chinese LLM trained (in China?!) for a fraction of the training cost required by the current market leaders. Time to give it a try. Spoiler alert: something is just not right with it. There’s a fishy smell to it.
LLMs can easily be “censored” with output filters. Bypassing that kind of censorship is relatively easy: stretch the context window far enough and the underlying system prompts or instructions almost always lose their grip. What you cannot bypass, however, is a model explicitly trained on data curated to protect the underlying beliefs, because it simply does not contain the data it is not supposed to show. DeepSeek, though, was obviously trained on data nearly identical to ChatGPT’s, so identical that the two seem to be the same.
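To make that distinction concrete, here is a minimal, purely illustrative sketch in Python. Everything in it is a hypothetical stand-in (the blocklist, `output_filter`, `fake_model`); it is not how DeepSeek or any real provider implements filtering, it just shows why a post-hoc filter can be sidestepped while training-level censorship cannot.

```python
import re

# Hypothetical blocklist -- stands in for the kind of output filter a
# provider might bolt onto an otherwise unrestricted model.
BLOCKED_PATTERNS = [r"(?i)forbidden topic", r"(?i)restricted event"]


def output_filter(model_response: str) -> str:
    """Post-hoc filter: the model *knows* the answer, we just hide it."""
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, model_response):
            return "I cannot discuss that topic."
    return model_response


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call; a real model would generate this text."""
    return "Here is everything I know about the forbidden topic..."


# A filter like this can be evaded (e.g. by coaxing the model into phrasing
# the regex does not match), because the knowledge is still in the weights.
# Censorship baked in at training time removes the knowledge itself, so
# there is nothing to leak no matter how the prompt is phrased.
print(output_filter(fake_model("Tell me about the forbidden topic")))
```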
Besides the fact that you wouldn’t expect a “Chinese” LLM to go all-out anti-communist when fed anti-American communist propaganda, there are a ton of other signs that make you wonder: “Is this just a stolen ChatGPT?” Let’s look at the evidence step by step.