China’s LLM Development

While LLMs from OpenAI, Anthropic and Google grab most of the headlines in the US, a number of models from Chinese companies are making steady headway.

Here’s an incomplete list of these models and the companies that make them:

DeepSeek by DeepSeek
Qwen by Alibaba
Minimax by Minimax
Kimi by Moonshot AI
Doubao by ByteDance

According to benchmark leaderboards, LLMs from the US frontier labs still dominate, but these Chinese LLMs perform well and are much cheaper. For example, Kimi K2 Thinking ranks #4 at High School Math and #3 overall on Humanity’s Last Exam, while its input cost is $0.60 per million tokens and its output cost is $2.50 per million tokens. In comparison, Claude Opus 4.7 costs $5 per million input tokens and $25 per million output tokens. In other words, these Chinese LLMs cost roughly one-tenth as much, while their performance is close to that of the frontier labs’ LLMs.
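To make the price gap concrete, here is a minimal sketch in Python that works through the arithmetic using only the per-million-token prices quoted above. The function names and the example workload (2,000 input tokens and 500 output tokens per request) are assumptions made for illustration; they are not part of any vendor’s API or pricing tool.

```python
# Prices in USD per million tokens, taken from the figures quoted above.
PRICES = {
    "Kimi K2 Thinking": {"input": 0.60, "output": 2.50},
    "Claude Opus 4.7": {"input": 5.00, "output": 25.00},
}


def price_ratio(cheaper: str, pricier: str, kind: str) -> float:
    """Ratio of two models' prices for 'input' or 'output' tokens."""
    return PRICES[cheaper][kind] / PRICES[pricier][kind]


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request, given its token counts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


if __name__ == "__main__":
    for kind in ("input", "output"):
        ratio = price_ratio("Kimi K2 Thinking", "Claude Opus 4.7", kind)
        print(f"{kind} price ratio: {ratio:.0%}")

    # Hypothetical workload: 2,000 input tokens and 500 output tokens.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 2_000, 500):.5f} per request")
```

The input price works out to 12% of the more expensive model’s and the output price to 10%, which is where the “roughly one-tenth” figure comes from. The exact ratio for a real workload depends on its mix of input and output tokens.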

While the performance and costs are impressive, what’s even more surprising is how many of these models there are. As we all know, training an LLM is quite expensive; it’s beyond the budget of most organizations. Even if we ignore the GPU sanctions Chinese companies face, how can they afford to train these LLMs?

Here’s how these Chinese companies’ valuations compare with those of some established Western software companies:

Company Name	Valuation (in USD)
Alibaba	~$300B
ByteDance	~$600B
DeepSeek	$45B - $50B
Minimax	~$34B
IBM	$280B
Adobe	$100B
Oracle	$649B
SAP	$200B

Notice that although Alibaba and ByteDance are huge companies, their valuations are on the same scale as those of IBM and Oracle. Yet neither IBM nor Oracle has trained a well-known frontier LLM. So why can Alibaba and ByteDance afford to do so when IBM and Oracle apparently cannot?

One may speculate that the Chinese government has provided subsidies, since AI is one of its high-priority areas.

Another possibility is that US companies have strategically avoided competing against the established players in this space. Oracle is generating significant revenue from Oracle Cloud Infrastructure, so there’s no point in competing against its best customers. The same may be true for Microsoft, but it doesn’t explain why Google both trains its own models and provides infrastructure to the frontier labs through Google Cloud.

A third possibility is that, because network restrictions make it difficult for Chinese users to access the US LLMs, Chinese companies see a unique market opportunity and are willing to invest in LLMs. The same cannot be said for non-Chinese companies. That would explain why so many Chinese companies compete for the Chinese market while non-Chinese companies are unwilling to compete against the existing players.