Nvidia is halting development of its dual-rack, 72-way GB200-based NVL36×2 to focus on the single-rack NVL72 and NVL36, according to analyst Ming-Chi Kuo, writing on Medium. He’s a reputable analyst and appears to have inside information on the matter. The single-rack NVL36 and NVL72 machines are still expected to come to market as planned, with the decision said to be driven by limited resources and customer preferences. However, previous reports indicated that the dual-rack NVL36×2 would be the most popular choice among Nvidia’s customers.
The company initially planned to develop three GB200 models based on Blackwell GPUs for AI and HPC workloads: NVL36, NVL72, and NVL36×2. However, managing all three projects became challenging, especially given the complexity of working on two different 72-GPU versions (NVL72 and NVL36×2) at the same time. As a result, Nvidia is now focusing solely on NVL72 and NVL36.
Nvidia’s GB200 NVL72 rack contains 18 compute trays, each holding two Bianca boards with one Grace CPU and two Blackwell GPUs per board, plus nine NVSwitch trays (18 NVSwitch ASICs). This is Nvidia’s most powerful offering, though it is also the most power-hungry one, consuming around 120kW.
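To make the rack arithmetic explicit, here is a minimal Python sketch that multiplies out the tray and board counts quoted above. The function and constant names are illustrative only and are not taken from any Nvidia documentation.

```python
# Per-rack figures as quoted in this article for the GB200 NVL72.
COMPUTE_TRAYS = 18
BOARDS_PER_TRAY = 2   # Bianca boards per compute tray
CPUS_PER_BOARD = 1    # Grace CPUs per Bianca board
GPUS_PER_BOARD = 2    # Blackwell GPUs per Bianca board


def rack_totals(trays=COMPUTE_TRAYS, boards_per_tray=BOARDS_PER_TRAY,
                cpus_per_board=CPUS_PER_BOARD, gpus_per_board=GPUS_PER_BOARD):
    """Return board, CPU, and GPU counts for one rack (hypothetical helper)."""
    boards = trays * boards_per_tray
    return {
        "boards": boards,
        "cpus": boards * cpus_per_board,
        "gpus": boards * gpus_per_board,
    }


totals = rack_totals()
print(totals)
```

Under these figures, the rack works out to 36 Bianca boards, 36 Grace CPUs, and 72 Blackwell GPUs, which is where the "72" in NVL72 comes from.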
SemiAnalysis expects this configuration to see limited use due to its extreme power and density requirements (typical rack power is 12kW, while an H100-based rack consumes around 40kW), which most datacenters cannot support. However, there’s one major client planning widespread deployment, and Ming-Chi Kuo claims that Microsoft has shown a clear preference for the NVL72 over the NVL36×2.
The GB200 NVL36×2 was to consist of two interconnected racks and was initially projected to be the more commonly adopted configuration. Each rack has 18 Grace CPUs and 36 Blackwell GPUs, maintaining full connectivity across the 72 GPUs. However, it would need 36 NVSwitch ASICs, twice as many as the NVL72, thus consuming more power than one NVL72 and offering slightly lower performance. One GB200 NVL36×2 was projected to consume 66kW per rack (132kW in total), slightly more than the NVL72, though its larger footprint and lower per-rack power would be more compatible with existing datacenters.
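The power trade-off can be worked through with a few lines of Python. This is a rough sketch that uses only the approximate figures quoted in this article; the variable names are illustrative, and the per-GPU numbers are simple divisions rather than measured values.

```python
# Approximate power figures quoted in this article (kW).
NVL72_RACK_KW = 120        # single-rack NVL72
NVL36X2_RACK_KW = 66       # each of the two NVL36x2 racks
NVL36X2_RACKS = 2
H100_RACK_KW = 40          # H100-based rack
TYPICAL_RACK_KW = 12       # typical datacenter rack
GPUS = 72                  # both 72-GPU configurations

nvl36x2_total_kw = NVL36X2_RACK_KW * NVL36X2_RACKS
extra_kw = nvl36x2_total_kw - NVL72_RACK_KW
extra_pct = 100 * extra_kw / NVL72_RACK_KW

print(f"NVL36x2 total: {nvl36x2_total_kw} kW "
      f"({extra_kw} kW, or {extra_pct:.0f}%, more than NVL72)")
print(f"Per GPU: NVL72 {NVL72_RACK_KW / GPUS:.2f} kW, "
      f"NVL36x2 {nvl36x2_total_kw / GPUS:.2f} kW")

# Per-rack density is what a datacenter has to provision for.
for name, kw in [("NVL72", NVL72_RACK_KW), ("NVL36x2 (per rack)", NVL36X2_RACK_KW)]:
    print(f"{name}: {kw / H100_RACK_KW:.2f}x an H100 rack, "
          f"{kw / TYPICAL_RACK_KW:.1f}x a typical rack")
```

The totals favor the NVL72 by roughly 10%, but each NVL36×2 rack draws about half as much as an NVL72 rack, which is the property that made the dual-rack design easier to fit into existing facilities.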
GB200 NVL72 is far more space efficient than GB200 NVL36×2, but most Nvidia customers cannot support NVL72’s power and cooling density requirements. Furthermore, these complexities could delay mass shipments of GB200 NVL72 to the second half of 2025, according to Ming-Chi Kuo. That said, previous reports indicated that some NVL72 machines will be delivered this December, presumably to Microsoft.
“My latest supply chain survey indicates that NVL72 mass production may be delayed until 2H25 (versus Nvidia’s optimistic target of 1H25),” Ming-Chi Kuo wrote.
Earlier this year, Nvidia ran into yield-killing issues with the packaging of its B100 and B200 GPUs for AI and HPC, which prompted it to produce low-yielding Blackwell hardware to meet demand while refining the design of these processors. The refined GPUs are set to enter mass production only in late October, so they will be ready to use in late January. In this context, focusing on a GB200-based NVL72 design aimed at the most demanding customers looking for maximum performance makes absolute sense for Nvidia.
It should also be noted that x86-based servers with Blackwell processors are only due in 2025. At this stage, the form factors for these machines remain unknown, with preliminary reports pointing to NVL72 and NVL36×2 machines. It’s likely that Nvidia has now pivoted to NVL72 and NVL36 racks first, with custom third-party solutions coming later.