AWS Integrates NVIDIA NVLink Fusion for Next-Gen AI Infrastructure
December 2, 2025
In a strategic move to address the soaring computational demands of modern artificial intelligence, Amazon Web Services (AWS) has announced a multi-generational collaboration with NVIDIA. The partnership, unveiled at AWS re:Invent, centers on integrating NVIDIA's NVLink Fusion platform to streamline the deployment of AWS's custom AI silicon, notably its upcoming Trainium4 accelerators, alongside its Graviton CPUs and Nitro virtualization infrastructure.
The initiative tackles a critical industry bottleneck. As AI models scale to hundreds of billions or even trillions of parameters, and architectures like mixture-of-experts (MoE) become prevalent, they require vast arrays of accelerators working in concert. This necessitates a high-bandwidth, low-latency scale-up network to connect entire racks of chips into a single, cohesive fabric. For hyperscalers like AWS, developing such a full-stack, rack-scale architecture—encompassing custom silicon, networking, power, cooling, and software—has traditionally represented a multi-billion-dollar, multi-year endeavor fraught with supply-chain complexity.
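To see why a single accelerator cannot serve these models, a back-of-envelope calculation helps. The figures below (1 trillion parameters, 2 bytes per weight, 192 GB of memory per accelerator) are illustrative assumptions for the sketch, not published specifications:

```python
import math

# Back-of-envelope: why trillion-parameter models force a multi-chip,
# rack-scale design. All constants are illustrative assumptions.
PARAMS = 1_000_000_000_000        # 1T-parameter model (assumed)
BYTES_PER_PARAM = 2               # FP16/BF16 weights (assumed)
HBM_PER_CHIP_GB = 192             # per-accelerator memory (assumed)

weights_tb = PARAMS * BYTES_PER_PARAM / 1e12
min_chips = math.ceil(PARAMS * BYTES_PER_PARAM / (HBM_PER_CHIP_GB * 1e9))

print(f"Weights alone: {weights_tb:.1f} TB")                    # 2.0 TB
print(f"Minimum chips just to hold the weights: {min_chips}")   # 11
```

Note this counts only the weights; optimizer state, activations, and KV caches multiply the footprint further during training and inference, which is why deployments span dozens of accelerators per model replica.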
NVLink Fusion is designed to mitigate these challenges by offering a proven, modular platform. At its core is the NVLink Fusion chiplet, which AWS can integrate directly into its Trainium4 design. The chiplet connects the custom chip to the NVLink 6 scale-up interconnect and the Vera Rubin NVLink Switch tray. This technology stack enables an NVLink Fusion rack to connect up to 72 custom ASICs in an all-to-all configuration, delivering an aggregate scale-up bandwidth of 260 TB/s. NVIDIA states that, when combined with its AI acceleration software, this approach can deliver up to three times the AI inference performance—and correspondingly higher revenue—compared to previous generations by creating a single, large-scale compute domain.
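The two quoted rack figures imply a per-chip bandwidth, which is worth making explicit as a sanity check (the division below uses only the numbers stated above):

```python
# Sanity-check the quoted rack figures: 260 TB/s of aggregate scale-up
# bandwidth shared across 72 ASICs implies roughly 3.6 TB/s per chip.
AGGREGATE_TBPS = 260   # aggregate scale-up bandwidth quoted above
NUM_ASICS = 72         # ASICs per NVLink Fusion rack

per_chip_tbps = AGGREGATE_TBPS / NUM_ASICS
print(f"~{per_chip_tbps:.2f} TB/s of NVLink bandwidth per ASIC")  # ~3.61
```

That ~3.6 TB/s per chip is double the 1.8 TB/s per GPU of the prior NVLink generation, consistent with a generational doubling of scale-up bandwidth.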
Beyond raw performance, the collaboration grants AWS access to NVIDIA's extensive ecosystem. This includes the modular MGX rack architecture, a portfolio of components from GPUs to DPUs, and a network of certified OEMs/ODMs and suppliers for everything from chassis to cooling systems. By leveraging this pre-validated supply chain and architecture, AWS aims to significantly reduce development costs, cut deployment risks, and accelerate its time-to-market for custom AI solutions.
The implications for the industry are substantial. This partnership signals a shift where leading cloud providers can more rapidly bring differentiated, custom silicon to market by building upon a standardized, high-performance networking backbone. For AWS, NVLink Fusion also enables a heterogeneous silicon strategy, allowing its Trainium4 chips, Graviton CPUs, and other components to coexist within the same rack-scale infrastructure footprint and management framework. This agility is crucial for meeting the intense and evolving demands of next-generation AI training and agentic AI workloads.
Source: NVIDIA