Nvidia launches Rubin CPX GPU for large-scale inferencing

Company says new chip great for generating video and code


Nvidia has announced a new GPU designed for large-scale inferencing tasks.


The company this week announced the Rubin CPX, a new class of GPU purpose-built for massive-context processing. The chip designer said the new GPU enables AI systems to handle million-token software coding and generative video faster and more efficiently.


The Rubin CPX delivers up to 30 petaflops of AI compute (NVFP4), and features 128GB of GDDR7 memory. The new chip is expected to be available at the end of 2026.


Available in multiple configurations, the Rubin CPX is set to be deployed alongside Vera CPUs and Rubin GPUs inside the new Nvidia Vera Rubin NVL144 CPX rack-scale platform.


The liquid-cooled integrated Nvidia MGX system offers eight exaflops of AI compute (NVFP4), which the company says will provide 7.5x more AI performance than GB300 NVL72 systems, as well as 100TB of fast memory and 1.7 petabytes per second of memory bandwidth in a single rack.
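
As a rough check on those figures, the short Python sketch below back-computes the GB300 NVL72 compute implied by Nvidia's 7.5x claim. It uses only the numbers quoted above. The variable and function names are illustrative, and the result is an implied figure rather than a published specification.

```python
# Figures quoted by Nvidia for the Vera Rubin NVL144 CPX rack (NVFP4).
NVL144_CPX_EXAFLOPS = 8.0
CLAIMED_SPEEDUP = 7.5  # versus GB300 NVL72, per Nvidia's claim


def implied_baseline_exaflops(new_exaflops: float, speedup: float) -> float:
    """Return the baseline compute implied by a claimed speedup."""
    return new_exaflops / speedup


# Assumption: the 7.5x claim compares like-for-like AI compute per system.
baseline = implied_baseline_exaflops(NVL144_CPX_EXAFLOPS, CLAIMED_SPEEDUP)
print(f"Implied GB300 NVL72 compute: about {baseline:.2f} exaflops")
```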


A dual-rack solution, comprising the NVL144 CPX rack and a ‘regular’ Vera Rubin NVL144 rack, is also set to be available.


“The Vera Rubin platform will mark another leap in the frontier of AI computing — introducing both the next-generation Rubin GPU and a new category of processors called CPX,” said Nvidia founder and CEO Jensen Huang. “Just as RTX revolutionized graphics and physical AI, Rubin CPX is the first CUDA GPU purpose-built for massive-context AI, where models reason across millions of tokens of knowledge at once.”


Nvidia said the Rubin CPX integrates video decoders and encoders, as well as long-context inference processing, in a single chip, making it useful for long-format applications such as video search and high-quality generative video.


The company claims that every $100 million invested in Vera Rubin NVL144 CPX can generate $5 billion in token revenue.
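
Taken at face value, that claim works out to a fixed revenue multiple. The minimal Python sketch below does the arithmetic using only the two dollar figures Nvidia quoted. The names are illustrative, and the result is revenue per dollar invested, not profit, since operating costs are not part of the claim.

```python
# Dollar figures from Nvidia's claim; nothing else is assumed.
INVESTMENT_USD = 100_000_000
CLAIMED_TOKEN_REVENUE_USD = 5_000_000_000


def revenue_multiple(revenue_usd: float, investment_usd: float) -> float:
    """Return claimed revenue generated per dollar invested."""
    return revenue_usd / investment_usd


multiple = revenue_multiple(CLAIMED_TOKEN_REVENUE_USD, INVESTMENT_USD)
print(f"Claimed token revenue per dollar invested: {multiple:.0f}x")
```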


Early adopters include coding tool Cursor, generative AI firm Runway, and AI firm Magic.


“Video generation is rapidly advancing toward longer context and more flexible, agent-driven creative workflows,” said Cristóbal Valenzuela, CEO of Runway. “We see Rubin CPX as a major leap in performance, supporting these demanding workloads to build more general, intelligent creative tools.”


Source: DCD
