DeepSeek Reveals R1 AI Model Cost Just $294,000 to Train

Chinese AI developer DeepSeek has disclosed that its popular R1 artificial intelligence model cost just $294,000 to train, a figure dramatically lower than those cited for competing US models. The disclosure, published in a peer-reviewed article in the journal Nature, marks the first time the Hangzhou-based company has shared detailed training cost information.

Training Infrastructure and Methodology

The disclosure shows DeepSeek used 512 Nvidia H800 chips to train the R1 model over 80 hours. These H800 processors were specifically designed for the Chinese market after US export restrictions in October 2022 banned the sale of more powerful H100 and A100 chips to China. However, DeepSeek acknowledged using A100 chips during preparatory development stages with smaller models.
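For a rough sense of scale, the reported figures can be combined into total GPU-hours and an implied cost per GPU-hour. The short Python sketch below uses only the numbers quoted in this article; it assumes the $294,000 covers exactly the 512-chip, 80-hour run and nothing else, which the article does not confirm, so the resulting rate is illustrative only. The function names are hypothetical.

```python
# Figures as reported in the article.
NUM_CHIPS = 512              # Nvidia H800 chips
TRAINING_HOURS = 80          # reported duration of the training run
REPORTED_COST_USD = 294_000  # reported training cost


def gpu_hours(num_chips: int, hours: float) -> float:
    """Total GPU-hours for a run using num_chips for the given hours."""
    return num_chips * hours


def implied_rate(cost_usd: float, total_gpu_hours: float) -> float:
    """Implied cost per GPU-hour, ASSUMING the cost covers only this run."""
    return cost_usd / total_gpu_hours


total = gpu_hours(NUM_CHIPS, TRAINING_HOURS)
rate = implied_rate(REPORTED_COST_USD, total)

print(f"{total:,} GPU-hours")
print(f"${rate:.2f} per GPU-hour (implied, under the stated assumption)")
```

Under that assumption the run amounts to 40,960 GPU-hours, or roughly $7.18 per GPU-hour.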

The $294,000 figure is far below what US rivals have described spending. OpenAI CEO Sam Altman previously stated that training foundational models costs “much more” than $100 million, though his company has not provided specific figures for individual releases.

Addressing Distillation Controversy

DeepSeek also responded to allegations that it deliberately copied OpenAI’s models through a process called “distillation,” in which a newer AI system is trained on the outputs of an existing model and so benefits from that model’s investment without bearing the original development costs. The company said that web pages crawled for its training data contained OpenAI-generated content, and insisted this was incidental rather than intentional.

The January release of DeepSeek’s cost-effective AI systems caused global tech stock volatility as investors worried about threats to established AI leaders like Nvidia. This latest disclosure is likely to reignite discussions about China’s position in the global artificial intelligence race.
