NVIDIA plans to deploy its new Rubin CPX GPUs as co-processors alongside the high-bandwidth memory (HBM) Rubin GPUs within its NVL144 racks. The combined systems are expected by late 2026, with the CPX units using GDDR7 memory to target specific AI inference workloads. "The new NVIDIA Rubin CPX is adding multiple GDDR7 memory GPUs alongside its 2026 Rubin HBM GPUs in the same NVL144 racks. Effectively, the big HBM Rubin GPUs are being equipped with GDDR7 Rubin CPX GPUs as co-processors," noted AI expert Brian Roemmele, who is analyzing the theoretical speed-up this architecture could deliver.
Rubin CPX targets the two distinct phases of AI inference. The prefill phase, in which the model processes the entire input prompt at once, is compute-intensive and less dependent on memory bandwidth. By running prefill on GPUs with more cost-efficient GDDR7 memory, NVIDIA aims to improve overall system efficiency and reduce operating costs. The HBM-equipped Rubin GPUs can then focus on the decode phase, in which output tokens are generated one at a time and performance is limited mainly by memory bandwidth. The result is a disaggregated inference pipeline in which each phase runs on hardware matched to its bottleneck.
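The difference between the two phases can be illustrated with a simple roofline-style estimate. The Python sketch below is a simplified model, not NVIDIA's methodology: it considers only the weights of a dense layer for a single request, ignores attention and the key-value cache, and uses a hypothetical accelerator whose peak compute and memory bandwidth are placeholder values, not Rubin or Rubin CPX specifications.

```python
def arithmetic_intensity(tokens_per_pass: int, bytes_per_param: float) -> float:
    """FLOPs performed per byte of weights read in one pass over a dense layer.

    Each parameter contributes one multiply-add (2 FLOPs) per token, but is
    read from memory only once per pass, however many tokens share it.
    """
    return 2.0 * tokens_per_pass / bytes_per_param


def bound_by(intensity: float, peak_flops: float, mem_bandwidth: float) -> str:
    """Roofline classification: compare intensity with the hardware ridge point."""
    ridge_point = peak_flops / mem_bandwidth  # FLOPs per byte the chip can sustain
    return "compute" if intensity >= ridge_point else "memory bandwidth"


# Hypothetical accelerator, chosen only for illustration (not a real product spec).
PEAK_FLOPS = 1.0e15      # 1 petaflop/s
MEM_BANDWIDTH = 2.0e12   # 2 TB/s, giving a ridge point of 500 FLOPs per byte
BYTES_PER_PARAM = 0.5    # assumes 4-bit weights

# Prefill: every token of an 8,192-token prompt reuses the same weights in one pass.
prefill = arithmetic_intensity(tokens_per_pass=8192, bytes_per_param=BYTES_PER_PARAM)

# Decode: a single request produces one new token per pass over the weights.
decode = arithmetic_intensity(tokens_per_pass=1, bytes_per_param=BYTES_PER_PARAM)

print("prefill is limited by", bound_by(prefill, PEAK_FLOPS, MEM_BANDWIDTH))
print("decode is limited by", bound_by(decode, PEAK_FLOPS, MEM_BANDWIDTH))
```

Under these assumptions, prefill performs thousands of operations for every byte of weights it reads, while single-request decode performs only a handful, which is why the two phases benefit from different memory systems. Batching many decode requests together raises decode intensity in practice, so the gap in a real deployment is narrower than this sketch suggests.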
The Vera Rubin NVL144 CPX platform, which houses both GPU types, is designed to deliver 8 exaflops of AI performance. Each Rubin CPX GPU has 128GB of GDDR7 memory and provides up to 30 petaflops of NVFP4 compute. The integrated rack system offers 100TB of fast memory and 1.7 petabytes per second of memory bandwidth.
The architecture is aimed at massive-context workloads such as million-token software coding and generative video. Pairing GDDR7-based CPX co-processors with HBM Rubin GPUs reflects NVIDIA's bet that matching hardware to each inference phase will improve both performance and efficiency. Industry observers such as Roemmele are watching how the design influences future AI systems and how much speed and scale it delivers in practice.