HPC has evolved quickly in response to massive data growth. Organizations of all types are seeking to extract maximum insight from their data to drive innovation. Achieving this outcome requires new approaches that combine modeling and simulation with analytics and AI workloads. These converged workloads will in turn require new developer and operator workflows that legacy HPC infrastructure cannot easily support.
These trends currently affect every industry and field of inquiry. A recent Intersect360 study found that the majority (61%) of HPC users are already running machine learning programs[1]. An additional 10% of respondents stated that they plan to do so by the end of 2020. This is an inflection point for a new era in computing, commonly referred to as the Exascale Era.
At previous inflection points (such as the rise of virtualization and the adoption of cloud, big data, and AI), legacy hardware and software infrastructure had to evolve radically to keep up with new requirements. This time is no different. The new converged HPC, analytics, and AI workflows will be fueled by new dataflows that deliver the right data, at the right time, and with the right economics. Storage technology that worked for petascale era workloads cannot power the Exascale Era’s converged workflows because the input/output (I/O) patterns of the applications and the characteristics of the currently deployed storage technologies could not be more different.
Traditional modeling and simulation workloads typically access large datasets serially, whereas AI and machine learning workloads can include both batch and random I/O, with access sizes ranging from very small (e.g., a single inference) to very large (e.g., ML model training). Staying with current HPC storage infrastructures will leave users unable to keep up in terms of both performance and budget. This can only be addressed by a new type of HPC storage.
- With traditional HPC storage systems, users will experience I/O bottlenecks for their AI and machine learning workloads, because traditional HPC storage is not well suited to serving the large number of files of all sizes that machine learning needs to read in the training phase. That can lead to job pipeline congestion, missed deadlines, unsatisfied data scientists, and constant escalations.
- Alternatively, if users try to scale their traditional enterprise AI storage to the potentially multi-petabyte requirements of converged workloads, they will most likely experience scalability issues and exploding storage costs.
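The contrast between the two access patterns can be illustrated with a minimal Python sketch. It is not a benchmark and says nothing about any particular storage product. It only shows what "serial access to a large dataset" and "many small random reads" look like at the file API level. The function names, chunk sizes, and the temporary test file are all illustrative assumptions.

```python
import os
import random
import tempfile


def make_test_file(size_bytes):
    """Create a temporary file of random bytes and return its path (illustrative helper)."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_bytes))
    return path


def sequential_read(path, chunk_size=1 << 20):
    """Read the file front to back in large chunks, as a simulation code might."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def random_read(path, n_reads, read_size=4096, seed=0):
    """Issue many small reads at random offsets, as a training data loader might."""
    rng = random.Random(seed)
    size = os.path.getsize(path)
    total = 0
    with open(path, "rb") as f:
        for _ in range(n_reads):
            # Pick an offset that leaves room for a full read_size read.
            f.seek(rng.randrange(0, size - read_size + 1))
            total += len(f.read(read_size))
    return total


if __name__ == "__main__":
    path = make_test_file(8 << 20)  # 8 MiB sample file
    try:
        print("sequential bytes read:", sequential_read(path))
        print("random bytes read:", random_read(path, n_reads=1000))
    finally:
        os.remove(path)
```

On a small local file both functions finish almost instantly. The difference the text describes appears at scale, where a storage system tuned for large streaming reads has to serve very many small, scattered requests.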
Introducing Cray ClusterStor E1000
The Cray ClusterStor E1000, built with AMD EPYC™ processors, was purpose-engineered for this new era: scalable, cost-effective, and delivering the performance needed to power a new kind of dataflow. It brings together the best of traditional HPC and modern all-flash enterprise file storage systems. The Cray ClusterStor E1000 system, in combination with new services and flexible consumption models from HPE, redefines what is possible for HPC storage users.
Here are just a few examples of what it can do:
- Remove I/O bottlenecks through unprecedented performance by delivering up to 80 gigabytes per second of throughput in just two rack units, with the help of the performance capabilities of the AMD EPYC™ processor
- Achieve a balance of scale, performance, and performance efficiency by providing up to 3.3 gigabytes per second of file system performance from just one NVMe Gen 4 SSD
- Deliver broad interoperability with any HPC cluster or supercomputer, from any vendor, that supports modern high-speed interconnects like EDR/HDR InfiniBand, 100/200 Gigabit Ethernet, or 200 Gbps Cray Slingshot
- Unify the support for the full HPC infrastructure stack with HPE Pointnext Services and create clear accountability for the providers of both HPC compute and storage
- Provide a future path to an “as-a-service” model for the full HPC infrastructure stack with HPE GreenLake, combining the agility and economics of public cloud consumption with the security and performance of on-premises HPC
Learn more about the Cray ClusterStor E1000 Exascale Era storage.