On January 20th, a Chinese company called “DeepSeek” released a new “reasoning” model, known as R1. DeepSeek claims that R1’s performance on several benchmark tests rivals that of the best US-developed models, and especially OpenAI’s o1 reasoning model, one of the large language models behind ChatGPT. And, more importantly, DeepSeek claims to have done it at a fraction of the cost of the US-made models.
At first glance, this looks like an indication that Chinese companies are gaining on their US competitors that have thus far maintained the lead in cutting-edge AI. But it’s not quite that simple.
There are three lenses through which we should look at the DeepSeek development: (1) the tech competition between the US and China; (2) the debate over proprietary, high-cost AI development versus open-source, lower-cost AI development; and (3) the geopolitical competition between the US and China.
Tech competition between the US and China
First, this development—a Chinese company having built a model that rivals the best US models—does make it look like China is closing the technology gap between itself and the US with respect to generative AI. And that’s true… sort of.
DeepSeek developed R1 using a technique called “distillation.” Without going into too much detail here, distillation allows developers to train or tune a smaller (and cheaper) model on either the output data or the probability distribution of a larger model. The DeepSeek developers published an arXiv paper that goes into greater detail on the techniques they developed to create R1.
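For readers who want to see the idea in miniature, here is a rough sketch of the probability-distribution variant of distillation. It is a generic textbook illustration, not DeepSeek's method: the function names, the logit values, and the temperature setting are all made up for the example. The "loss" measures how far the small model's predicted distribution is from the large model's, and training the small model means pushing that number toward zero.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's."""
    p = softmax(teacher_logits, temperature)  # large "teacher" model
    q = softmax(student_logits, temperature)  # small "student" model
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical scores over three candidate next words (illustrative values only).
teacher_logits = [4.0, 1.0, 0.5]
untrained_student = [0.0, 0.0, 0.0]  # no preference yet
trained_student = [3.8, 1.1, 0.4]    # has learned to imitate the teacher

print(distillation_loss(teacher_logits, untrained_student))
print(distillation_loss(teacher_logits, trained_student))
```

The second number printed is smaller than the first, because the trained student's distribution sits closer to the teacher's. The output-data variant works differently: the smaller model is simply fine-tuned on text the larger model has generated.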
The point here is that R1 is derivative of larger models. DeepSeek could not have developed R1 without using the larger, more expensive US-developed large language models.
So, if we’re looking strictly through the lens of technological competition between the US and China, DeepSeek’s R1 does not signal that Chinese companies are ahead of US companies.
But that doesn’t mean the R1 development isn’t significant! It’s especially significant if we look through the second lens—the debate over high-dollar, proprietary models vs. lower-cost, open-source models.
Proprietary models vs. open-source models
DeepSeek showed that, given a high-performing generative AI model like OpenAI’s o1, fast-followers can develop open-source models that mimic the high-end performance quickly and at a fraction of the cost.
No one knows exactly how much the big American AI companies (OpenAI, Google, and Anthropic) spent to develop their highest-performing models, but according to reporting, Google invested between $30 million and $191 million to train Gemini, and OpenAI invested between $41 million and $78 million to train GPT-4. DeepSeek, by contrast, claims that it achieved similar capabilities with just $5.6 million (and without the cutting-edge chips that US export controls have prevented China from buying).
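To put “a fraction of the cost” in concrete terms, the short calculation below divides DeepSeek's claimed figure by the low and high ends of each reported range. The dollar amounts are the ones quoted above; the variable and function names are just for illustration, and the result is only as reliable as the underlying reporting and DeepSeek's own claim.

```python
# All figures in millions of US dollars, as quoted in the text above.
deepseek_claimed = 5.6
reported_ranges = {
    "Gemini (Google)": (30.0, 191.0),
    "GPT-4 (OpenAI)": (41.0, 78.0),
}

def cost_share(small, large):
    """Return the smaller cost as a fraction of the larger one."""
    return small / large

for model, (low, high) in reported_ranges.items():
    share_of_high = cost_share(deepseek_claimed, high)
    share_of_low = cost_share(deepseek_claimed, low)
    print(f"{model}: {share_of_high:.1%} to {share_of_low:.1%} of reported training cost")
```

On these numbers, DeepSeek's claimed spend comes to roughly 3 to 19 percent of Gemini's reported training cost and roughly 7 to 14 percent of GPT-4's.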
This development will change the incentives driving investment in AI. Up to this point, the big AI companies were willing to invest billions in infrastructure to gain marginal advantages over their competitors. If, as DeepSeek has demonstrated, smaller and less well-funded competitors can now follow close behind, delivering similar performance at a fraction of the cost, those smaller companies will naturally peel customers away from the big three. And that, in turn, will affect the larger companies’ willingness to invest in infrastructure. Time will tell.
Strategic competition between the US and China
Finally, the third lens is the strategic competition between the US and China. Here is where it gets most interesting.
OpenAI claims that DeepSeek violated its terms of service by using OpenAI’s o1 model to distill R1. On the one hand, some have argued that DeepSeek is only asking OpenAI how its own medicine tastes. OpenAI’s models, after all, have been trained on publicly available data, including intellectual property that rightfully belongs to creators other than OpenAI. On the other hand, China has a long history of stealing US intellectual property—a trend that US leaders have long recognized has had a significant impact on the US.
DeepSeek didn’t need to hack into any servers or steal any documents to train its R1 model using OpenAI’s model. It just needed to violate OpenAI’s terms of service. Many AI companies’ terms of service prohibit using distillation to create competitor models, and violating those terms is a lot easier than other methods of stealing intellectual property. Still, this approach fits into a broader trend in which Chinese companies are willing to use US technology development as a jumping-off point for their own research.
On this issue, I commend to you (with a caveat) Kai-Fu Lee’s book, AI Superpowers: China, Silicon Valley, and The New World Order. The caveat is this: Lee claims in the book to be an honest broker—someone who has seen tech development from the inside of both Silicon Valley and Shenzhen. This may or may not be true. Regardless, the book reads like an advertisement for the PRC’s approach to tech development.
In that book, Lee argues that one of the crucial elements of China’s entrepreneurial sector is the lack of protection of intellectual property. Unlike in the US, Lee argues, in China there are no patents or copyrights, and no protected trademarks or licensing rights.
In other words, if a Chinese entrepreneur is first-to-market with a new product or idea, there is nothing—nothing but sweat and grind—to prevent a sea of competitors from stealing the idea and running with it. As Lee argues, this is a benefit of the Chinese system because it makes Chinese entrepreneurs stronger. Like the saying about living in New York, entrepreneurs who survive in the Chinese market can survive anywhere.
Given this background, it comes as no surprise at all that DeepSeek would violate OpenAI’s terms of service to produce a competitor model with similar performance at a lower training cost. According to Lee, that’s the Chinese way.
So significant is R1’s reliance on OpenAI’s system that, in CNBC’s coverage, when the reporter asks DeepSeek’s R1 “What model are you?”, R1 responds, “I’m an AI language model called ChatGPT, developed by OpenAI.”
DeepSeek’s model does seem to conform to Chinese Communist Party sensitivities. Videos have proliferated online of users asking R1 questions about Tiananmen Square, Taiwan, and whether President Xi looks like Winnie the Pooh. In typical responses, the model provides an answer, ever so briefly, before replacing it with a claim that it is “not sure how to approach this type of question.”
One final thought as we consider the strategic competition between the US and China. DeepSeek released its R1 model that rivals the best American models on January 20th—inauguration day.
Is that a coincidence? Maybe. Maybe not.