DeepSeek not ‘miracle,’ but impressive: Report debunks Chinese AI app's $5 million claim

As social media platforms and stock markets buzz with the popularity of the new AI company DeepSeek, a report by Bernstein stated that DeepSeek looks fantastic but is not a miracle and was not built for $5 million.

Written By Riya R Alex
Published 29 Jan 2025, 01:08 PM IST
The report said the claim that DeepSeek, which is comparable to OpenAI's ChatGPT, was built at a cost of $5 million is false. (AFP)

With the rising popularity of DeepSeek, a recent report by Bernstein stated that the Chinese AI app looks fantastic but is not a miracle, and it has not been built for $5 million.

The report said the claim that DeepSeek, which is comparable to OpenAI's ChatGPT, was built at a cost of $5 million is false.

"We believe that DeepSeek DID NOT "build OpenAI for $5M"; the models look fantastic, but we don't think they are miracles; and the resulting Twitter-verse panic over the weekend seems overblown," ANI reported, citing the Bernstein report.

“The models they built are fantastic, but they aren’t miracles either,” said Bernstein analyst Stacy Rasgon, who follows the semiconductor industry and was one of several stock analysts describing Wall Street’s reaction as overblown, reported Associated Press.


The Chinese AI company has developed two main families of AI models, ‘DeepSeek-V3’ and ‘DeepSeek-R1’.

The V3 model is a large language model that uses a mixture-of-experts (MoE) architecture. This architecture combines multiple smaller expert networks that work together, delivering high performance while using fewer resources than other large models. In total, the V3 model has 671 billion parameters, with nearly 37 billion active parameters at a time.
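As a rough illustration of the mixture-of-experts idea described above (a minimal sketch with illustrative expert counts and a toy router, not DeepSeek's actual architecture), a router picks only a few experts per token, so only a fraction of the total parameters do work at any one time:

```python
import numpy as np

rng = np.random.default_rng(0)

n_experts = 8   # illustrative; real MoE models use far more experts
top_k = 2       # only the top_k experts are active per token
d_model = 16    # toy hidden dimension

# Each "expert" is a small feed-forward weight matrix.
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    """Route one token vector to its top_k experts and mix their outputs."""
    scores = x @ router                    # one router logit per expert
    top = np.argsort(scores)[-top_k:]      # indices of the top_k experts
    weights = np.exp(scores[top])
    weights /= weights.sum()               # softmax over the chosen experts only
    # Only top_k of the n_experts weight matrices are touched for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.standard_normal(d_model))
print(out.shape)  # (16,)

# Fraction of experts active per token, mirroring V3's ~37B of 671B parameters.
active_fraction = top_k / n_experts
print(active_fraction)  # 0.25
```

The efficiency comes from the routing step: each token pays the compute cost of `top_k` experts, not all of them, which is why a 671-billion-parameter model can run with only ~37 billion parameters active.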

The model also includes innovative techniques such as Multi-Head Latent Attention (MHLA), which reduces memory usage, and mixed-precision training using FP8 computation for efficiency.
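To see why lower precision matters at this scale, a back-of-the-envelope calculation (ours, not from the report) compares raw weight storage for 671 billion parameters at common precisions:

```python
params = 671e9  # total V3 parameters, per the report

# Bytes per parameter at common numeric precisions.
precisions = {"FP32": 4, "BF16": 2, "FP8": 1}

for name, nbytes in precisions.items():
    terabytes = params * nbytes / 1e12
    print(f"{name}: ~{terabytes:.1f} TB of weights")
# FP32: ~2.7 TB, BF16: ~1.3 TB, FP8: ~0.7 TB
```

Halving the bytes per parameter roughly halves memory traffic, which is the intuition behind FP8 training being cheaper than higher-precision alternatives.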


For the V3 model, DeepSeek used a cluster of 2,048 NVIDIA H800 GPUs for nearly two months: about 2.7 million GPU hours for pre-training and 2.8 million GPU hours in total, including post-training.

According to estimates, the cost of this training comes to nearly $5 million based on a $2-per-GPU-hour rental rate. The report notes that this figure does not account for the other costs incurred in developing the model.
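The report's arithmetic can be reproduced directly (the $2-per-hour rental rate is the report's assumption, and the ~57-day run length is our reading of "nearly two months"):

```python
total_gpu_hours = 2.8e6  # GPU hours including post-training, per the report
rate_per_hour = 2.0      # assumed rental rate, USD per GPU hour

cost = total_gpu_hours * rate_per_hour
print(f"~${cost / 1e6:.1f} million")  # ~$5.6 million — the 'nearly $5 million' figure

# Sanity check: 2,048 GPUs running around the clock for ~57 days
# lands close to the reported total GPU hours.
cluster_hours = 2048 * 24 * 57
print(cluster_hours)  # 2801664
```

The point the report makes is that this number covers only the rented compute for the final training run, not research, staff, failed experiments, or infrastructure.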


DeepSeek-R1, which competes most directly with OpenAI's models, is built on the V3 foundation but uses Reinforcement Learning (RL) and other techniques to improve reasoning capabilities.

The resources required for the R1 model were substantial but were not accounted for by the company, the report said.

The report acknowledged that DeepSeek's models are impressive but maintained that the panic, and the claim of building an OpenAI competitor for $5 million, are overblown.
