
Nvidia (NASDAQ: NVDA) stock is currently down 12% from its all-time high. It suffered a sharp sell-off in January after China-based start-up DeepSeek asserted that it had trained a competitive artificial intelligence (AI) model using a fraction of the computing power that had been deployed by leading U.S.-based developers like OpenAI.
Investors feared that DeepSeek’s techniques would be adopted by other AI developers, leading to a substantial drop in demand for Nvidia’s high-end graphics processing units (GPUs), which are the best hardware available for developing AI models. However, those concerns might have been overblown.
Google parent Alphabet (NASDAQ: GOOG)(NASDAQ: GOOGL) is a big buyer of Nvidia’s AI data center chips, and on Feb. 4, its CEO, Sundar Pichai, made some comments that should make Nvidia’s investors feel much better.
DeepSeek was established in 2023 by a successful Chinese hedge fund called High-Flyer, which had been using AI to build trading algorithms for years. DeepSeek released its V3 large language model (LLM) in December 2024, followed by its R1 reasoning model in January, and their competitiveness with some of the latest models from OpenAI and other start-ups got the tech sector buzzing.
Since DeepSeek’s work is open source, the industry quickly learned some important details. The start-up claims to have trained V3 for just $5.6 million (not including an estimated $500 million in chips and infrastructure, according to SemiAnalysis), which is a drop in the bucket compared to the tens of billions of dollars spent by companies like OpenAI to reach their current stage of development.
DeepSeek also used less powerful versions of Nvidia’s GPUs, such as the H800 (a pared-back variant of the flagship H100 built for export to China), because the U.S. government banned the chip maker from selling its latest hardware to Chinese firms in order to protect America’s AI leadership.
It turns out DeepSeek implemented some unique innovations on the software side to make up for the lack of computational power. It developed highly efficient algorithms and data input methods, and it also used a technique called distillation, which involves using the knowledge from an already-successful large AI model to train a smaller model.
In fact, OpenAI has accused DeepSeek of using its GPT-4o models to train DeepSeek R1 by prompting the ChatGPT chatbot at scale and “learning” from its outputs. Distillation dramatically speeds up training because the developer doesn’t have to collect or process mountains of data from scratch, and it therefore requires far less computing power, which means fewer GPUs.
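For readers curious what distillation actually looks like, here is a minimal sketch in PyTorch of the classic soft-label version of the technique: a small “student” model is trained to imitate the output distribution of a large, frozen “teacher” model. The function and variable names are illustrative assumptions for a generic setup, not DeepSeek’s or OpenAI’s actual code; DeepSeek reportedly worked from a larger model’s text outputs rather than its raw probabilities, but the principle of a small model learning from a big one is the same.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Classic soft-label distillation: push the student's output
    distribution toward the teacher's. The temperature softens both
    distributions so low-probability answers still carry signal."""
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between the two distributions, scaled by T^2
    # (the standard correction from Hinton et al., 2015).
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

# Hypothetical usage for one training step; `teacher`, `student`, and
# `batch` are placeholders, not code from DeepSeek or OpenAI.
# with torch.no_grad():
#     teacher_logits = teacher(batch)   # large, frozen teacher model
# loss = distillation_loss(student(batch), teacher_logits)
# loss.backward()                       # update only the student
```

In practice, this loss is usually blended with an ordinary training objective on real data, so the student learns from both the teacher and the ground truth, which is precisely why the approach needs far less raw data and far fewer GPUs than training a frontier model from scratch.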