World of Software
Copyright © All Rights Reserved. World of Software.
Nvidia previews Rubin CPX graphics card for disaggregated inference – News

News Room
Published 9 September 2025
Last updated: 2025/09/09 at 7:48 PM

Nvidia Corp. today previewed an upcoming chip, the Rubin CPX, that will power artificial intelligence appliances with 8 exaflops of performance.

AI inference involves two main steps. First, the AI model analyzes the information it will draw on to answer the user’s prompt. Once that analysis is complete, the model generates its response one token at a time. Today, both steps usually run on the same hardware.

Nvidia plans to take a different approach with its future AI systems. Instead of performing both steps of the inference workflow using the same graphics card, it plans to assign each step to a different chip. The company calls this approach disaggregated inference.
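
The split between the two steps can be illustrated with a toy sketch. It is not Nvidia's implementation: the function names, the `KVCache` class and the placeholder "model" arithmetic are all hypothetical, and the assumption that the context step hands a cache of per-token state to the generation step is ours, since the article does not say what is passed between the chips.

```python
from dataclasses import dataclass, field


@dataclass
class KVCache:
    """Hypothetical stand-in for the state produced by the context phase."""
    entries: list = field(default_factory=list)


def context_phase(prompt_tokens):
    """Step 1: process the whole prompt in one pass."""
    # Toy "analysis": keep one cache entry per prompt token.
    return KVCache(entries=list(prompt_tokens))


def generation_phase(cache, max_new_tokens):
    """Step 2: emit the response one token at a time."""
    output = []
    for _ in range(max_new_tokens):
        # Toy "model": the next token is the sum of the cache modulo 100.
        next_token = sum(cache.entries) % 100
        output.append(next_token)
        cache.entries.append(next_token)  # each new token extends the cache
    return output


def disaggregated_inference(prompt_tokens, max_new_tokens):
    # In a disaggregated setup, each call below would run on a different chip.
    cache = context_phase(prompt_tokens)
    return generation_phase(cache, max_new_tokens)


print(disaggregated_inference([1, 2, 3], 3))
```

The point of the sketch is the boundary between the two functions: the first does all of its work up front on the full prompt, while the second loops token by token, which is why the two steps can benefit from differently tuned hardware.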

Nvidia’s upcoming Rubin CPX chip is optimized for the initial, so-called context phase of the two-step inference workflow. The company will use it to power a rack-scale system called the Vera Rubin NVL144 CPX (pictured). Each appliance will combine 144 Rubin CPX chips with 144 Rubin GPUs, upcoming processors optimized for both phases of the inference workflow. The accelerators will be supported by 36 central processing units.

The company says the upcoming system will provide 8 exaflops of computing capacity, which is more than seven times the performance of the top-end GB300 NVL72 appliances Nvidia currently sells. One exaflop corresponds to a quintillion computing operations per second.
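
As a rough check on those figures, the short calculation below converts 8 exaflops into operations per second and derives the ceiling that "more than seven times" implies for the GB300 NVL72. It takes the quoted numbers at face value; the article does not state the numeric precision behind them, and the variable names are ours.

```python
EXA = 10**18  # one exaflop = a quintillion operations per second

system_exaflops = 8      # figure quoted for the Vera Rubin NVL144 CPX
speedup_lower_bound = 7  # "more than seven times" the GB300 NVL72

system_ops_per_second = system_exaflops * EXA

# If the new system is more than 7x faster, the GB300 NVL72
# must deliver less than this many exaflops.
gb300_upper_bound_exaflops = system_exaflops / speedup_lower_bound

print(f"{system_ops_per_second:.1e} operations per second")
print(f"GB300 NVL72 implied at under {gb300_upper_bound_exaflops:.2f} exaflops")
```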

Under the hood, the Rubin CPX is based on a monolithic die design with 128 gigabytes of integrated GDDR7 memory. Nvidia also included components optimized to run the attention mechanism of large language models.

An LLM’s attention mechanism enables it to identify and prioritize the most important parts of the text snippet it’s processing. According to Nvidia, the Rubin CPX can perform the task three times faster than its current-generation silicon. “We’ve tripled down on the attention processing,” said Ian Buck, Nvidia’s vice president of hyperscale and high-performance computing.
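
For readers unfamiliar with the mechanism, the sketch below shows scaled dot-product attention, the standard formulation used in transformer models, for a single query. It is written in plain Python for readability and says nothing about how the Rubin CPX implements attention in hardware; the example vectors are made up.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector."""
    scale = math.sqrt(len(query))
    # Score each key by its similarity to the query.
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    # Blend the value vectors according to the weights.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# The query matches the first key, so the first value gets more weight.
weights, output = attention(
    [1.0, 0.0],
    [[1.0, 0.0], [0.0, 1.0]],
    [[10.0, 0.0], [0.0, 10.0]],
)
print(weights, output)
```

Each token in a prompt is scored against the others in this way, so the amount of work grows quickly as prompts get longer, which is one reason the operation is a target for dedicated hardware.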

Buck said video processing workloads will receive a speed boost as well. The Rubin CPX includes hardware-level support for video encoding and decoding: encoding compresses a clip before it’s transmitted over the network to save bandwidth, and decoding reconstructs the video for playback on the receiving end.

According to Nvidia, the Rubin CPX will enable AI models to process prompts with one million tokens’ worth of data. That corresponds to tens of thousands of lines of code or one hour of video. In many cases, increasing the amount of data an AI model can consider while generating a prompt response boosts its output quality. 

Nvidia plans to start shipping the Rubin CPX at the end of 2026. 

Image: Nvidia
