
Nvidia (NASDAQ: NVDA) supplies some of the world’s most advanced graphics processing units (GPUs) for data centers — hardware that developers use to power and train artificial intelligence (AI) software. Demand for its chips far exceeds what it can currently supply, which helps explain how the company has added over $2.3 trillion to its market capitalization since the start of 2023.
At Nvidia’s annual GPU Technology Conference (GTC) last month, CEO Jensen Huang laid out some incredible catalysts that could accelerate the company’s already rapid growth. With its stock currently trading down 27% from its record high amid the sharp sell-off in the broader market, this could be a significant buying opportunity.
Large language models (LLMs) sit at the foundation of every AI application. These models are trained on mountains of data, and the more data an LLM can access, the “smarter” the resulting tool will be. However, training them requires massive amounts of computing power — particularly parallel processing power — which is why there is so much demand for Nvidia’s data center GPUs.
Until recently, LLMs delivered “one-shot” responses, meaning a chatbot would rapidly generate a single output for every prompt input by the user. While this method was fast and effective, it failed to weed out inaccuracies, which detracted from the models’ value and the user experience. Now, top developers like OpenAI, Anthropic, and DeepSeek are focusing on an entirely different approach called test-time scaling, or “reasoning.”
Rather than simply ingesting endless amounts of data, these models spend more time “thinking” before rendering responses to inputs. In other words, they make better use of the data they already have, and are more apt to clear up any inaccuracies behind the scenes before releasing the final output. This approach has been wildly successful, producing some of the most advanced AI models to date, such as OpenAI’s o1 series, DeepSeek’s R1, Anthropic’s Claude 3.7 Sonnet, and Alphabet’s Gemini 2.5 Pro.
However, reasoning models require significantly more computing power. Huang says each response consumes 10 times more tokens (words, punctuation, and symbols) because of how much “thinking” goes on in the background, and as a result, the models are also much slower to render a final output. GPUs will need to be 10 times faster to offset that slowdown, he says, and he estimates that developers will soon need a staggering 100 times more computing power to deploy reasoning models with a satisfactory user experience.
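The arithmetic behind Huang's 100-times figure can be sketched in a few lines: roughly 10 times more tokens per response, multiplied by hardware that must run roughly 10 times faster to keep response times acceptable. The function and constants below are purely illustrative back-of-the-envelope math based on the figures quoted above, not Nvidia's own model.

```python
# Back-of-the-envelope sketch of Huang's estimate (illustrative only).
# Reasoning models generate ~10x more tokens per response, and GPUs
# must be ~10x faster to offset the added latency; the two factors
# multiply to his ~100x compute estimate.

TOKEN_MULTIPLIER = 10   # ~10x more tokens per reasoning response
SPEED_MULTIPLIER = 10   # ~10x faster generation to hold latency steady

def required_compute_multiplier(token_factor: float, speed_factor: float) -> float:
    """Relative compute needed vs. a one-shot model (baseline = 1)."""
    return token_factor * speed_factor

print(required_compute_multiplier(TOKEN_MULTIPLIER, SPEED_MULTIPLIER))  # 100
```

If either factor grows further as reasoning chains lengthen, the total compute requirement scales multiplicatively, which is the crux of the bullish demand argument.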