Nvidia’s T4 GPUs, unveiled last fall for accelerating workloads such as AI inference and training, are making their “global” debut as cloud instances on Google Cloud.
Google said Monday (April 29) that it is the first cloud provider to offer Nvidia T4 cloud instances across multiple regions, now available in beta. The company announced T4 availability across eight regions earlier this month, making it the first to offer Nvidia’s Tesla T4 globally. (Amazon Web Services announced “G4” cloud instances based on T4 GPUs in March.)
The cloud rivals are offering T4 GPU instances based on Nvidia’s Turing architecture as the market for datacenter-based machine learning training and inference continues to boom. Nvidia estimates that as much as 90 percent of the cost of machine learning at scale is devoted to AI inference.
Nvidia rolled out the T4 last fall with the goal of accelerating machine learning training and inference within datacenters at a lower price point. To that end, the T4 includes Tensor Cores to speed training, along with hardware acceleration for ray tracing.
Tensor Core GPUs support so-called “mixed precision” training of machine learning workloads. Cloud vendors are offering T4 instances as an alternative to the higher-end V100 GPU. For training workloads that don’t need the full horsepower of the V100, which is also aimed at traditional HPC workloads, “the T4 offers the acceleration benefits of Turing Tensor Cores, but at a lower price,” Google noted in a blog post announcing its new T4 cloud instances. “This is great for large training workloads, especially as you scale up more resources to train faster, or to train larger models.”
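The mixed-precision idea Google describes can be sketched in a few lines: store tensors in FP16 to halve memory and bandwidth, but accumulate dot products in FP32 to preserve accuracy, which is what Turing Tensor Cores do in hardware. The NumPy comparison below is purely illustrative (real training would use a framework’s automatic mixed-precision support, not hand-rolled arithmetic):

```python
import numpy as np

rng = np.random.default_rng(0)
activations = rng.standard_normal((64, 128)).astype(np.float16)  # FP16 storage
weights = rng.standard_normal((128, 32)).astype(np.float16)      # FP16 storage

# FP16 inputs with FP32 accumulation -- the "mixed" in mixed precision
mixed = np.matmul(activations.astype(np.float32),
                  weights.astype(np.float32))

# The same product carried out entirely in FP16, for comparison
pure_fp16 = np.matmul(activations, weights)

# Reference result computed in FP64
reference = np.matmul(activations.astype(np.float64),
                      weights.astype(np.float64))

err_mixed = np.abs(mixed - reference).max()
err_fp16 = np.abs(pure_fp16.astype(np.float64) - reference).max()
print(f"max error, FP32 accumulation: {err_mixed:.2e}")
print(f"max error, FP16 accumulation: {err_fp16:.2e}")
```

The FP32-accumulated result tracks the FP64 reference far more closely than the all-FP16 one, while the inputs and weights still enjoy FP16’s smaller footprint.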
The partners also pitched T4 GPU cloud instances as a way to reduce latency and boost throughput for inference models, noting that Tensor Cores with mixed precision accelerated inference on the ResNet-50 image-classification neural network by as much as tenfold.
The Nvidia T4 GPUs are also aimed at batch compute HPC and rendering workloads, such as those conducted by Princeton University neuroscience researcher Sebastian Seung. “We are excited to partner with Google Cloud on a landmark achievement for neuroscience: reconstructing the connectome of a cubic millimeter of neocortex,” commented Seung. “It’s thrilling to wield thousands of T4 GPUs powered by Kubernetes Engine. These computational resources are allowing us to trace 5 km of neuronal wiring, and identify a billion synapses inside the tiny volume.”
Google said T4 instances can be accessed for as low as $0.29 per hour per GPU, with “on-demand” instances starting at $0.95 per hour per GPU. It is also offering “sustained use” discounts.
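At those rates the gap compounds quickly over a long run. A back-of-the-envelope comparison, using the two hourly figures quoted above (the 100-hour, four-GPU job size here is a made-up example):

```python
# Hourly T4 rates quoted by Google: $0.29 (lowest quoted rate) and
# $0.95 (on-demand). The job size below is hypothetical.
hours = 100
gpus = 4
lowest_rate, on_demand_rate = 0.29, 0.95

lowest_cost = hours * gpus * lowest_rate        # $116.00
on_demand_cost = hours * gpus * on_demand_rate  # $380.00
print(f"lowest: ${lowest_cost:.2f}, on-demand: ${on_demand_cost:.2f}")
```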
Google Cloud also offers Nvidia V100 instances.
Google Cloud’s T4 GPU availability includes three regions each in the U.S. and Asia and one each in South America and Europe. Those regions are linked by a high-speed network.