Interest in artificial intelligence (AI) is rising fast, as practical applications in speech recognition, image analysis, and other complex pattern recognition tasks have now surpassed human accuracy. Once trained, a neural network can be deployed on edge devices or in the cloud to meet demands in near real time. Yet training a deep neural network within a reasonable time frame requires the speed and scale of a high performance computing (HPC) system.
Like many other HPC workloads, neural network training is both compute- and data-intensive. It stresses every aspect of cluster design, from floating-point performance to memory bandwidth and capacity to message latency and network bandwidth. Powerful processors are essential, but not sufficient to meet these intense demands.
Intel® Scalable System Framework (Intel® SSF) is designed to address the technology barriers that currently limit performance for neural network training and other HPC workloads. The framework accomplishes this by delivering balanced high performance at every layer of the solution stack: compute, memory, storage, fabric, and software. This holistic, system-level approach simplifies the design of optimized clusters and helps organizations take advantage of disruptive new technologies with less effort and lower risk.
That’s a good thing, because disruptive new technologies are coming fast.
Powerful Compute for AI Acceleration
Intel® Xeon® processors and Intel® Xeon Phi™ processors are key compute components of Intel SSF. Intel announcements at SC16 highlighted their success versus GPUs in addressing the performance and scalability challenges of deep neural networks. This is just the beginning. The Intel processor roadmap is poised to deliver a 100X increase in neural network training performance within the next three years[1], shattering the performance barriers that currently slow AI innovation. Intel SSF will help to unleash the full potential of these and other Intel processor innovations.
Groundbreaking Memory and Storage Technologies
The performance gap between processors and memory/storage solutions has been widening for decades, requiring ever-more complex workarounds. Beginning with the current Intel Xeon Phi processor family, Intel is offering up to 16 GB of fast on-package memory to help resolve the data access bottleneck. This, too, is just a beginning. Intel breakthroughs in memory and storage technology are beginning to enter the market now, and are pivotal to the Intel SSF roadmap. These innovations will allow memory and storage to finally catch up with processor performance, enabling transformative new efficiencies that will redefine what is possible and affordable in AI and other data-driven fields.
A Fabric for the Future of AI
Neural network training is a tightly coupled application that alternates compute-intensive number crunching with cluster-wide data sharing, so fabric performance is critical. Intel SSF addresses this challenge with Intel® Omni-Path Architecture (Intel® OPA), which matches the line speed of EDR InfiniBand and includes optimizations to improve message passing efficiency and fabric scalability while reducing cost. Today's Intel Xeon Phi processors offer integrated Intel OPA controllers to further reduce latency and cost. Ongoing processor and fabric integration will increase efficiency at every scale and provide a cost-effective foundation for the massive neural networks of tomorrow.
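To make the compute/communicate rhythm concrete, here is a minimal sketch of synchronous data-parallel training. Each worker computes gradients on its own data shard (the compute phase), then the gradients are averaged across all workers (the cluster-wide sharing phase that stresses the fabric). This is an illustrative toy, not Intel SSF or OPA code: the all-reduce is simulated in-process, where a real cluster would perform it over the interconnect via MPI or a framework's communication layer.

```python
def local_gradients(weights, shard):
    # Compute phase: toy mean-squared-error gradient on one worker's shard.
    grads = [0.0] * len(weights)
    for x, y in shard:
        pred = sum(w * xi for w, xi in zip(weights, x))
        err = pred - y
        for i, xi in enumerate(x):
            grads[i] += 2.0 * err * xi / len(shard)
    return grads

def allreduce_mean(per_worker_grads):
    # Communication phase: average gradients across all workers.
    # On a real cluster this is the latency-sensitive all-reduce
    # that runs over the fabric.
    n = len(per_worker_grads)
    dim = len(per_worker_grads[0])
    return [sum(g[i] for g in per_worker_grads) / n for i in range(dim)]

def training_step(weights, shards, lr=0.1):
    grads = [local_gradients(weights, s) for s in shards]  # compute
    avg = allreduce_mean(grads)                            # share
    return [w - lr * g for w, g in zip(weights, avg)]      # update
```

Because every worker must wait for the slowest all-reduce to finish before the next step, message latency and bandwidth directly bound how well this loop scales out.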
Optimized Software that Ties It All Together
Although foundational AI algorithms have been around since the mid-1960s, they were designed for functionality, not performance. Intel is working with vendors and the open source community to deliver software that is highly optimized for performance on Intel architecture across the full breadth of AI and HPC needs. This includes everything from core math libraries and machine learning frameworks, to memory- and logic-based AI applications. It also includes essential system software, such as Intel® HPC Orchestrator, which helps to simplify the design, deployment, and use of high-performing systems that can scale cost-effectively to support extreme requirements.
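A brief illustration of why optimized math libraries matter: deep learning frameworks spend most of their cycles in dense linear algebra such as matrix multiplication. The naive triple loop below states the computation; in practice NumPy dispatches the same operation to an optimized BLAS (on many Intel systems, Intel® Math Kernel Library), replacing the loop with vectorized, cache-blocked kernels. This is a generic sketch of the optimization layering, not Intel SSF code.

```python
import numpy as np

def naive_matmul(a, b):
    # Reference triple-loop matrix multiply: correct, but no
    # vectorization, cache blocking, or threading.
    n, k = a.shape
    k2, m = b.shape
    assert k == k2
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            for p in range(k):
                out[i, j] += a[i, p] * b[p, j]
    return out

a = np.random.rand(16, 16)
b = np.random.rand(16, 16)
reference = naive_matmul(a, b)
optimized = a @ b  # dispatched to the underlying optimized BLAS
```

Both paths compute the same result; the difference is that the library path exploits the processor's full vector width and memory hierarchy, which is exactly the kind of architecture-specific tuning the paragraph above describes.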
A Launching Pad for AI Innovation
The next wave of AI innovation will require enormous new computing capability. Intel SSF provides a unified platform that enables a leap forward in performance and efficiency for AI and a host of other HPC workloads, including big data analytics, data visualization, and digital simulation.
As innovation heats up, the advantages will grow. Intel SSF will help AI innovators ride the wave of escalating performance while maintaining application compatibility[2], so they can focus on driving deeper and more useful intelligence into almost everything they create.
We can’t wait to see the results.
Stay tuned for more articles focusing on the benefits Intel SSF brings to AI at each level of the solution stack through balanced innovation in compute, memory, storage, fabric, and software technologies.
[1] https://www.hpcwire.com/2016/11/21/intel-details-ai-hardware-strategy/
[2] The aim of Intel SSF is to help drive generation-by-generation performance gains that benefit existing software without requiring a recompile. Additional, and in some cases massive, gains may become possible through software optimization.