Nvidia’s latest GPU, the T4, continues to rack up wins. Unlike the V100, which is geared for traditional HPC scale-up environments including model training, the T4 is aimed more at inference applications and scale-out environments. At GTC China yesterday (Nov. 20), Nvidia announced that Baidu, Tencent, JD.com and iFLYTEK have begun using “T4 to expand and accelerate their hyperscale datacenters.” In addition, Chinese computer makers Inspur, Inspur Power Systems, Lenovo, QCT, Huawei, Sugon and H3C announced a wide range of new T4 servers.
In his GTC China keynote, CEO Jensen Huang referred to the T4 as Nvidia’s “first hyperscale GPU,” noting that it went into production just 30 days ago and is the fastest-adopted server GPU in the company’s history. Since its introduction, it has racked up on the order of 57 design wins.
“The continued rapid adoption of T4 makes complete sense, given its unprecedented capabilities,” said Ian Buck, vice president of accelerated computing at Nvidia. “Never before have we introduced a GPU that gives public and private clouds the combined performance and energy efficiency they need to more economically run their compute-intensive workloads at scale. And in markets where ‘scale’ really counts, we expect T4 to be extremely popular.”
Nvidia also announced adopters of its Nvidia HGX-2 server platform intended for AI deep learning, machine learning and high performance computing. Baidu and Tencent are using HGX-2 for a wide range of AI services for internal use and for their cloud customers. Inspur is the first Chinese company to build an HGX-2 server, and Huawei, Lenovo and Sugon all announced that they have become Nvidia HGX-2 server partners.
Nvidia claims the HGX-2 can run AI machine learning workloads nearly 550x faster, AI deep learning workloads nearly 300x faster and HPC workloads nearly 160x faster than a CPU-only server.
At SC18 last week, Nvidia announced Google had deployed the T4. Among previously announced server companies featuring the Nvidia T4 are Dell EMC, Hewlett Packard Enterprise, IBM, Lenovo, and Supermicro. The T4 is based on Nvidia’s Turing architecture and features multi-precision Turing Tensor Cores and new RT Cores. (For a full look at Nvidia’s SC18 announcements see HPCwire article, Nvidia’s Jensen Huang Delivers Vision for the New HPC.)
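The inference focus comes down to precision: the T4’s multi-precision Turing Tensor Cores support FP32, FP16, INT8 and INT4 math, and quantizing trained weights to low precision is what lets inference run at much higher throughput with lower memory traffic. The sketch below is a hypothetical illustration of the idea (symmetric per-tensor INT8 quantization), not Nvidia code:

```python
# Hypothetical illustration of INT8 weight quantization, the kind of
# low-precision trick the T4's Tensor Cores are built to accelerate.
# Storing each weight in 1 byte instead of 4 is lossy but usually close.

def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats onto [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate float values from the INT8 codes."""
    return [c * scale for c in codes]

weights = [0.75, -1.5, 0.002, 3.0]
codes, scale = quantize_int8(weights)
approx = dequantize(codes, scale)
# The largest weight (3.0) maps to code 127 and round-trips exactly;
# the others come back within a small quantization error.
```

Real deployments (e.g. via an inference compiler/runtime) also calibrate activations and handle per-channel scales, but the storage-vs-accuracy trade-off is the same.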
The newly announced servers include:
- Inspur NF5280M4/NF5280M5/NF5288M5/NF5468M5
- Huawei G2500/2288 HV5/5288V5/G530 V5/G560 V5
- Lenovo ThinkSystem SR630/SR650
- Sugon X580-G30/X745-G30/X780-G30/X780-G35/X785-G30/X740-H30
- Inspur Power Systems FP5295G2
- H3C Uniserver G4900G3
Nvidia says the newly announced T4 systems are expected to begin shipping before the end of the year.
The announced HGX-2 systems feature Nvidia’s NVSwitch interconnect fabric, which links 16 Tesla V100 Tensor Core GPUs so they operate as a single giant GPU delivering two petaflops of AI performance. The platform also provides 0.5TB of memory and 16TB/s of aggregate memory bandwidth.
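The headline compute and memory figures follow directly from the published per-GPU V100 specs (an assumption here: 125 TFLOPS peak Tensor Core throughput and 32 GB of HBM2 for the 32 GB variant), as this back-of-envelope check shows:

```python
# Back-of-envelope check of the HGX-2 headline numbers, assuming published
# per-GPU Tesla V100 specs (125 TFLOPS Tensor Core peak, 32 GB HBM2 each).
num_gpus = 16
tensor_tflops_per_gpu = 125   # V100 peak mixed-precision Tensor Core rate
hbm2_gb_per_gpu = 32          # V100 32 GB variant

ai_petaflops = num_gpus * tensor_tflops_per_gpu / 1000   # 2.0 PFLOPS
total_memory_tb = num_gpus * hbm2_gb_per_gpu / 1024      # 0.5 TB
```

The 16TB/s figure is the aggregate bandwidth across the system rather than a simple per-GPU sum, so it is not reproduced here.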
Link to Nvidia T4 news release: https://nvidianews.nvidia.com/news/nvidia-turing-t4-cloud-gpu-adoption-accelerates
Link to Nvidia HGX-2 news release: https://nvidianews.nvidia.com/news/nvidia-hgx-2-gpu-accelerated-platform-gains-broad-adoption
Watch Nvidia CEO Jensen Huang address 5,000+ attendees of the GPU Technology Conference (GTC) in Suzhou, China.