If it’s Nvidia GPUs you’re after to power your AI, HPC, or visualization workloads, Google Cloud has them, now claiming the “broadest GPU availability” among public clouds. Each of the three big public cloud vendors has in turn touted the latest and greatest Nvidia hardware, but this time it’s Google Cloud’s turn as first mover. In November, the cloud provider announced early access to Tesla T4 “Turing” GPUs, and today it has followed up that introduction with public availability and greater global reach.
In addition to putting T4 instances in its U.S. and Netherlands GCP regions, with this beta launch Google is extending the GPU option to Brazil, India, Singapore, and Tokyo datacenters, marking the first time GPUs have been offered in those GCP regions.
“We’ve distributed our T4 GPUs across the globe in eight regions, allowing you to provide low latency solutions to your customers no matter where they are,” wrote Chris Kleban, product manager, Cloud GPUs, in a blog post this morning. “The T4 joins our Nvidia K80, P4, P100 and V100 GPU offerings, providing customers with a wide selection of hardware-accelerated compute options.”
Based on Nvidia’s Turing architecture (unveiled in August), the T4 is the successor to the P4 Pascal-based chips, introduced in 2016. Incorporating 320 Turing Tensor Cores and 2,560 CUDA cores, the T4 claims a theoretical 8.1 teraflops of single-precision performance, 65 teraflops of mixed-precision, 130 teraops of INT8 and 260 teraops of INT4 performance. Google notes that the T4’s 16 GB of memory benefits both large training models and the running of many smaller inference models.
Google expects that V100 GPU instances will continue to be the most popular option for training workloads in the cloud, but with its lower price point, the T4 is a great choice for scale-out distributed training or for workloads that don’t need the full power of a V100, according to the web giant.
Currently, Google Cloud is the only major cloud provider offering T4 GPUs, positioning the parts as complementary to its V100-backed instances. “You can scale up with large VMs up to eight V100 GPUs, scale down with lower cost T4 GPUs or scale out with either T4 or V100 GPUs based on your workload characteristic,” said Kleban.
A fully equipped T4 machine gets you four GPUs, 96 vCPUs, and 624 GB of host memory, with the option to add 3 TB of in-server local SSD. With preemptible VM instances, T4 GPUs can be used for as low as $0.29 per hour per GPU; on-demand instances start at $0.95 per hour per GPU. Google offers sustained use discounts of up to 30 percent.
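To put those rates in perspective, here is a back-of-the-envelope monthly cost comparison for a four-GPU T4 machine, using only the per-GPU prices quoted above. This is a rough sketch: it models the GPU charges alone (vCPUs, memory, and local SSD are billed separately), and it applies the maximum 30 percent sustained-use discount as a flat rate, whereas Google actually applies the discount incrementally by usage tier.

```python
# Rough GPU-only monthly cost for a 4x T4 VM, using the per-GPU
# rates quoted in the article. Host vCPU/memory/SSD charges are
# billed separately and are not modeled here.
ON_DEMAND_PER_GPU_HR = 0.95    # USD per hour per GPU, on-demand
PREEMPTIBLE_PER_GPU_HR = 0.29  # USD per hour per GPU, preemptible
GPUS = 4
HOURS_PER_MONTH = 730          # average hours in a month

on_demand = ON_DEMAND_PER_GPU_HR * GPUS * HOURS_PER_MONTH
# Simplification: apply the maximum 30% sustained-use discount flat;
# in practice GCP applies it in tiers across the billing month.
sustained = on_demand * 0.70
preemptible = PREEMPTIBLE_PER_GPU_HR * GPUS * HOURS_PER_MONTH

print(f"On-demand:            ${on_demand:,.2f}/month")
print(f"With max SUD (~30%):  ${sustained:,.2f}/month")
print(f"Preemptible:          ${preemptible:,.2f}/month")
```

Even with the full sustained-use discount, a month of preemptible T4 time comes in well under half the on-demand price, which is why preemptible instances are attractive for fault-tolerant, scale-out inference and training jobs.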
The Google Cloud AI team also published a technical blog for developers describing how to employ T4 GPUs in conjunction with the Nvidia TensorRT platform in order to run deep learning inference on large-scale workloads.