Six Important Answers From a Week of DeepSeek Questions
Calmer minds will prevail on both AI and its infrastructure
A week on from the soon-to-be-legendary market meltdown DeepSeek triggered, the noise is subsiding. With more data coming to light, we can finally start to sort out what's what, away from the investor frenzy and the breathless headlines.
1. Is the 30x price reduction a fair assessment?
No. Many headlines have cited a $6m training cost for DeepSeek V3. This is misleading. By DeepSeek's own admission, the $6m excludes "costs associated with prior research and ablation experiments on architectures, algorithms and data". The pre-training run is only a narrow slice of the total cost: it leaves out R&D, the total cost of ownership (TCO) of the hardware itself, and any potential subsidies from the Chinese state.
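The arithmetic behind the headline number is straightforward: GPU-hours multiplied by a market rental rate. A minimal sketch, using the widely reported figures from DeepSeek's own V3 technical report (neither number appears in this article, so treat them as assumptions):

```python
# Reconstructing the headline "~$6m" pre-training figure.
# Inputs are the commonly cited numbers from DeepSeek's V3 report,
# not figures from this article.

gpu_hours = 2_788_000    # reported H800 GPU-hours for the final pre-training run
rental_rate_usd = 2.0    # assumed market rental price per H800 GPU-hour

pretraining_cost = gpu_hours * rental_rate_usd
print(f"Final-run pre-training cost: ${pretraining_cost / 1e6:.2f}M")

# Note what this number omits: prior research, ablations, failed runs,
# hardware CAPEX/TCO, and staffing -- the pieces the article flags.
```

Even granting these inputs, the figure prices only the single final run at rental rates; it says nothing about what it cost to get there.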
For reference, Claude 3.5 Sonnet cost tens of millions of dollars to train. If that were the total cost Anthropic needed, it would not be raising billions from Google and tens of billions from Amazon.
$500M of CAPEX is a more likely figure according to SemiAnalysis, given the rumours of a 10k-A100 cluster circulating around High-Flyer, DeepSeek's owner. High-Flyer and DeepSeek today often share resources, both human and…