New Report Debunks DeepSeek’s Supposed Cost Advantage Over ChatGPT
A recent analysis by SemiAnalysis has challenged the widely circulated claim that DeepSeek’s development costs are just a fraction of what OpenAI spent on training ChatGPT. The report reveals that the supposed $6 million training cost for DeepSeek V3 is highly misleading, as it only accounts for GPU pre-training costs while excluding R&D, infrastructure, and other essential expenses.
The True Cost of DeepSeek’s AI Development
According to the report, “Our analysis shows that the total server CapEx for DeepSeek is ~$1.6B, with a considerable cost of $944M associated with operating such clusters.” This contradicts earlier claims that DeepSeek was developing cutting-edge AI at a small fraction of the cost of its Western counterparts.
The report also clarifies that DeepSeek has access to roughly 50,000 Hopper GPUs, but emphasizes that this does not mean 50,000 H100s, as some had assumed. Instead, the GPU fleet consists of a mix of:
• H100s
• H800s (a China-specific variant of the H100)
• H20s (a lower-performance model NVIDIA designed for the Chinese market in response to U.S. export controls)
DeepSeek operates its own data centers, allowing for a more streamlined structure than larger AI labs like OpenAI or Google DeepMind. However, this structure does not necessarily mean that its operations are dramatically cheaper.
DeepSeek’s Performance: Competitive but Not Dominant
SemiAnalysis also evaluated DeepSeek’s R1 model, finding that it matches OpenAI’s o1 model in reasoning tasks but does not dominate across all benchmarks. While DeepSeek has gained significant attention for its pricing and efficiency, Google’s Gemini Flash 2.0 was cited as a comparable model that offers similar performance at an even lower cost when accessed via API.
Key Innovation: Multi-Head Latent Attention (MLA)
A major highlight of DeepSeek’s technology is its Multi-Head Latent Attention (MLA) system, which dramatically reduces inference costs by cutting KV cache usage by 93.3%. This innovation improves efficiency and could significantly lower operational expenses. However, the report notes that any advancements DeepSeek makes will likely be quickly adopted by Western AI labs, limiting any long-term cost advantage.
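To make the cache savings concrete, here is a minimal back-of-the-envelope sketch in Python comparing the per-token KV-cache footprint of standard multi-head attention with an MLA-style compressed latent cache. The model dimensions (n_layers, n_heads, head_dim, latent_dim) are illustrative assumptions rather than DeepSeek’s published configuration, so the printed reduction will not exactly match the reported 93.3% figure.

```python
# Toy comparison of per-token KV-cache size: standard multi-head attention
# vs. an MLA-style compressed latent cache.
# All dimensions below are illustrative assumptions, not DeepSeek's actual settings.

n_layers = 60          # assumed transformer depth
n_heads = 128          # assumed number of attention heads
head_dim = 128         # assumed per-head dimension
latent_dim = 512       # assumed width of the compressed KV latent
bytes_per_elem = 2     # fp16 / bf16 storage

# Standard attention caches a full key and value vector for every head in every layer.
standard_kv_bytes = n_layers * n_heads * head_dim * 2 * bytes_per_elem

# An MLA-style cache stores one shared low-rank latent per layer instead,
# from which keys and values are reconstructed at attention time.
mla_kv_bytes = n_layers * latent_dim * bytes_per_elem

print(f"standard KV cache per token: {standard_kv_bytes / 1024:.1f} KiB")
print(f"latent KV cache per token:   {mla_kv_bytes / 1024:.1f} KiB")
print(f"reduction: {1 - mla_kv_bytes / standard_kv_bytes:.1%}")
```

The sketch illustrates the underlying design choice: rather than caching full per-head key and value vectors, MLA caches a single compressed latent per layer and expands it back into keys and values during attention, which is where the large memory savings come from.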
Future Cost Reductions and Challenges
The report also suggests that AI training costs could drop another 5x by the end of the year, benefiting both DeepSeek and other AI labs. DeepSeek’s leaner structure enables it to move faster than larger, more bureaucratic competitors, but U.S. export restrictions on high-end GPUs remain a major obstacle to its future scaling efforts.
Conclusion
The claim that DeepSeek’s AI models cost just a fraction of ChatGPT’s development budget is highly exaggerated. While the company has made meaningful strides in efficiency, its total infrastructure costs, GPU investments, and R&D expenses place it much closer to major Western AI labs than initial reports suggested.
With NVIDIA ($NVDA) continuing to dominate the AI hardware market, DeepSeek’s reliance on constrained Chinese GPU variants like the H800 and H20 could become a limiting factor in its growth. Meanwhile, its cost-saving innovations are unlikely to remain exclusive for long, as global AI leaders rapidly adapt new efficiency breakthroughs.