Google kicked off Google I/O this afternoon by talking for more than an hour about its numerous advances in artificial intelligence. The company discussed its new PaLM 2 large language model (LLM) for generative AI, which powers the Bard chatbot tool. PaLM 2 is a foundational pillar for adding AI-infused features across Google’s product portfolio, including Google Maps, Google Photos, and Gmail.
With that in mind, Google needs some serious horsepower in the cloud to run these models in the wild, as millions (and eventually billions) of users send requests for operations ranging from the mundane, such as removing a person lingering in the background of a picture, to composing an entire email based on a short text prompt. That’s where Google’s new A3 GPU supercomputer comes into focus. Google says the new A3 supercomputers are “purpose-built to train and serve the most demanding AI models that power today’s generative AI and large language model innovation” while delivering up to 26 exaFlops of AI performance.
Each A3 supercomputer is packed with 4th-generation Intel Xeon Scalable processors backed by 2TB of DDR5-4800 memory. But the real “brains” of the operation are the eight Nvidia H100 “Hopper” GPUs, which have access to 3.6 TBps of bisection bandwidth by leveraging NVLink 4.0 and NVSwitch.
According to Google, A3 is the first GPU instance to use its custom-designed 200 Gbps Infrastructure Processing Units (IPUs), which let GPU-to-GPU data transfers bypass the host CPU. Google says this results in up to a 10x uplift in available network bandwidth for A3 virtual machines (VMs) compared to A2 VMs.
“Google Cloud’s A3 VMs, powered by next-generation NVIDIA H100 GPUs, will accelerate training and serving of generative AI applications,” said Ian Buck, VP for hyperscale and high-performance computing at NVIDIA. “On the heels of Google Cloud’s recently launched G2 instances, we’re proud to continue our work with Google Cloud to help transform enterprises around the world with purpose-built AI infrastructure.”
If your business wants to leverage A3 virtual machines, the only way to gain access is by filling out Google’s A3 Preview Interest Form to join the Early Access Program. But as Google clearly states, plugging in your information doesn’t guarantee a spot in the program.