A Single Demo Won Jensen's $20B in Three Weeks
Nvidia spent $20 billion on a "why not." That was enough. At GTC 2026 in San Jose, Jonathan Ross, Groq's CEO and now also Nvidia's chief software architect, told the origin story of one of the largest deals in semiconductor history.

Nvidia's $20 billion acquisition of Groq's inference technology began with a simple "why not" from Jensen Huang, after Sunny Madra proposed opening NVLink to Groq. A proof-of-concept demo that disaggregated LLM inference between Nvidia GPUs and Groq's SRAM-based LPUs was then shown to Huang, and the deal was signed within three weeks. The combined architecture now sits in Nvidia's AI factory as the Groq 3 LPX Rack. It offers a premium tier of fast tokens (200 to 400 tokens per second per user) alongside cheaper, slower tokens, and delivers up to 35 times higher inference throughput per megawatt than Vera Rubin alone.
- Nvidia's $20B Groq deal closed in three weeks after a single proof-of-concept demo combined Nvidia GPUs with Groq's fast SRAM-based LPUs.
- The disaggregated architecture leverages silicon physics: Groq's LPUs excel at fast token generation while Nvidia's GPUs provide high aggregate throughput, enabling a premium tier for high-interactivity AI workloads.
- Nvidia's GTC keynote projected that the combined system could drive close to $300 billion in annual revenue per gigawatt for customers. This is a keynote projection, not an audited figure.

