To improve chatbot performance, Nvidia plans to sell a new kind of processor, an LPU, optimized to run large language models (LLMs).
The “Nvidia Groq 3 LPU” chip was among seven upcoming chips Nvidia touted at the company’s annual GTC event, where it pitched the AI industry on why Nvidia’s chips continue to lead.
The LPU, or Language Processing Unit, comes from Nvidia’s deal this past December to license technology from a California AI company called Groq (not to be confused with the AI chatbot Grok from xAI). Founded in 2016, Groq released earlier generations of LPU chips designed specifically for LLMs, promising faster speeds and better energy efficiency. The aim: to create an alternative to Nvidia’s enterprise GPUs, which can handle a wider range of AI workloads.
Nvidia now wants to pair the newly revealed Groq 3 LPU with the rest of the company’s next-generation AI chips, dubbed the “Vera Rubin” platform, which includes the upcoming Rubin GPU and Vera CPU tech for data centers.
Groq’s LPU chips use SRAM (static RAM), which is faster than the HBM (high-bandwidth memory) typically found on Nvidia’s GPUs. The downside: Groq’s LPUs offer only “hundreds of megabytes” of SRAM, whereas HBM can span a hundred gigabytes or more per chip.
That’s why a single Groq 3 LPU only contains 500MB of SRAM, while Nvidia’s upcoming Rubin GPU will feature 288GB of HBM4 memory. To compensate for the lower memory capacity, Nvidia is preparing to sell large batches of LPUs to work alongside the rest of its data center chips, giving AI companies a way to squeeze out even more performance.
Nvidia noted “the LPX rack with 256 LPU processors features 128GB of on-chip SRAM and 640TB/s of scale-up bandwidth. Deployed with Vera Rubin NVL72 (server unit), Rubin GPUs and LPUs boost decode by jointly computing every layer of the AI model for every output Token.”
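Nvidia’s per-chip and per-rack figures are consistent with each other: 256 LPUs at 500MB of SRAM apiece works out to the quoted 128GB of on-chip memory for the rack. A quick back-of-the-envelope check (numbers taken from this article, not from an Nvidia spec sheet):

```python
# Sanity-check the LPX rack figures cited above.
lpus_per_rack = 256
sram_per_lpu_mb = 500  # per-chip SRAM, as stated for the Groq 3 LPU

total_sram_mb = lpus_per_rack * sram_per_lpu_mb
total_sram_gb = total_sram_mb / 1000  # decimal GB, as marketing specs typically use

print(f"Rack SRAM: {total_sram_gb:.0f} GB")  # matches Nvidia's quoted 128GB
```

Even at rack scale, that 128GB of SRAM is still less than half the 288GB of HBM4 on a single Rubin GPU, which is why Nvidia positions the LPUs as companions to its GPUs rather than replacements.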
A data center could thus harness both the LPUs and Nvidia’s GPUs, dividing AI workloads between them to increase efficiency. Nvidia’s CEO, Jensen Huang, said the combined approach excels at helping AI companies boost performance with longer prompts.
Combined, the LPUs and Rubin GPUs also promise to deliver up to a 35x increase in throughput when running a large language model with 1 trillion parameters, according to Nvidia’s benchmarks.
“We’re in production with the Groq chip,” Huang said, adding that it’ll likely ship in Q3. Nvidia has contracted Samsung to manufacture the LPU. One analyst already expects Nvidia to ship out 4 to 5 million LPUs through 2026 and 2027.
The new LPU and Vera Rubin systems will likely cost tens of thousands of dollars per chip, putting them far out of reach of consumers. Instead, expect the biggest AI companies, including OpenAI, Anthropic, and Meta, to adopt these technologies, which could power your chatbot queries or image-generation requests in the near future.
At GTC, Nvidia also talked up Vera Rubin, a platform the company has detailed before, including at January’s CES, where it said the Rubin chips were in “full production.” Nvidia plans to ship the Vera Rubin chips, including the new LPU, in the second half of this year.
About Our Expert
Michael Kan
Senior Reporter
Experience
I’ve been a journalist for over 15 years. I got my start as a schools and cities reporter in Kansas City and joined PCMag in 2017, where I cover satellite internet services, cybersecurity, PC hardware, and more. I’m currently based in San Francisco, but previously spent over five years in China, covering the country’s technology sector.
Since 2020, I’ve covered the launch and explosive growth of SpaceX’s Starlink satellite internet service, writing 600+ stories on availability and feature launches, but also the regulatory battles over the expansion of satellite constellations, fights with rival providers like AST SpaceMobile and Amazon, and the effort to expand into satellite-based mobile service. I’ve combed through FCC filings for the latest news and driven to remote corners of California to test Starlink’s cellular service.
I also cover cyber threats, from ransomware gangs to the emergence of AI-based malware. Earlier this year, the FTC forced Avast to pay consumers $16.5 million for secretly harvesting and selling their personal information to third-party clients, as revealed in my joint investigation with Motherboard.
I also cover the PC graphics card market. Pandemic-era shortages led me to camp out in front of a Best Buy to get an RTX 3000. I’m now following how President Trump’s tariffs will affect the industry. I’m always eager to learn more, so please jump in the comments with feedback and send me tips.