NVIDIA Presents Rubin CPX, a New Class of GPU for Long-Context Inference

Last updated: 2025/09/10 at 5:32 AM

News Room Published 10 September 2025

Nvidia has presented its new Rubin CPX GPU at the AI Infrastructure Summit. Designed for context windows of more than one million tokens, the GPU is optimized for long-context sequences and is intended to be used as part of an infrastructure approach known as disaggregated inference.

For users, this GPU should deliver higher performance on long-context tasks in the systems that include it, especially video generation and software development. According to the company, there is still a long wait ahead: it is expected to arrive at the end of next year, 2026.

The GPU will be offered in the form of cards that can be integrated into server designs, or as discrete computers that can work on their own alongside other hardware in data centers.

The design is a derivative of the Rubin product line, which will arrive next year. According to Nvidia, it was developed because certain types of AI work need to be carried out more efficiently. The company has said that some parts of AI inference computing, such as the process of generating responses to requests, are not as efficient as they should be.

This is apparently because a single chip, a type of GPU, is responsible both for understanding the input request and for then creating and delivering the answer to it. By separating the comprehension stage from response generation, which is what this new type of GPU makes possible, Nvidia believes its customers will have more efficient hardware.
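As a rough illustration of that split, the following Python sketch separates a request into a context ("prefill") stage and a generation ("decode") stage. It is a toy model under stated assumptions, not Nvidia's implementation: the names `KVCache`, `prefill`, `decode` and `serve` are hypothetical, and the arithmetic merely stands in for a real model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KVCache:
    """Hypothetical stand-in for the state handed from the context stage to the generation stage."""
    entries: List[int] = field(default_factory=list)


def prefill(prompt_tokens: List[int]) -> KVCache:
    """Context stage: read the whole prompt in a single pass and build the cache."""
    # A real model would compute attention keys and values; here each token
    # just contributes one toy value.
    return KVCache(entries=[token * 2 for token in prompt_tokens])


def decode(cache: KVCache, max_new_tokens: int) -> List[int]:
    """Generation stage: emit one token per step, rereading the full cache each time."""
    output = []
    for _ in range(max_new_tokens):
        next_token = sum(cache.entries) % 1000  # toy substitute for a model forward pass
        output.append(next_token)
        cache.entries.append(next_token * 2)    # the cache grows with every generated token
    return output


def serve(prompt_tokens: List[int], max_new_tokens: int) -> List[int]:
    """Run both stages in sequence; in a disaggregated setup each could run on different hardware."""
    cache = prefill(prompt_tokens)        # assumed to run on context-optimized hardware
    return decode(cache, max_new_tokens)  # assumed to run on generation-optimized hardware


if __name__ == "__main__":
    print(serve([1, 2, 3], max_new_tokens=2))
```

Because the only thing passed between the two functions is the cache, each stage could in principle be assigned to hardware tuned for its own workload, which is the idea behind the approach described above.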

This integrated NVIDIA MGX system offers 8 exaflops of AI computing power, 100 TB of fast memory and 1.7 petabytes per second of memory bandwidth in a single rack. It is built on the Rubin architecture.

Rubin CPX uses a monolithic die design with NVFP4 compute resources, optimized to deliver the very high performance and energy efficiency that AI inference tasks require. For video and multimedia generation, Rubin CPX series chips will make it possible to build systems that handle decoding, encoding and processing on a single chip.

According to Nvidia CEO Jensen Huang, "The Vera Rubin platform will mark another great advance at the frontier of AI computing, since it will introduce both the next-generation Rubin GPU and a new category of processors called CPX. Just as RTX revolutionized graphics and physical AI, Rubin CPX is the first CUDA GPU specifically designed for massive-context AI, in which models reason across millions of tokens of knowledge at once."