IBM and NVIDIA Collaborate to Expand Open Source Machine Learning Tools for Data Scientists

Oct. 10, 2018 — IBM today announced that it plans to incorporate the new RAPIDS open source software into its enterprise-grade data science platform for on-premises, hybrid, and multicloud environments. With IBM’s vast portfolio of deep learning and machine learning solutions, it is best positioned to bring this open-source technology to data scientists regardless of their preferred deployment model.

“IBM has a long collaboration with NVIDIA that has shown demonstrable performance increases leveraging IBM technology, like the IBM POWER9 processor, in combination with NVIDIA GPUs,” said Bob Picciano, Senior Vice President of IBM Cognitive Systems. “We look to continue to aggressively push the performance boundaries of AI for our clients as we bring RAPIDS into the IBM portfolio.”

RAPIDS will help bring GPU acceleration capabilities to IBM offerings that take advantage of open source machine learning software including Apache Arrow, Pandas and scikit-learn. Immediate, wide ecosystem support for RAPIDS comes from key open-source contributors including Anaconda, BlazingDB, Graphistry, NERSC, PyData, INRIA, and Ursa Labs.

IBM is planning to bring RAPIDS to key areas across on-premises, public, hybrid, and multicloud environments, including:

PowerAI on IBM POWER9, to leverage RAPIDS to expand the options available to data scientists with new open source machine learning and analytics libraries. Accelerated workloads have been proven to get a direct benefit from the special engineering that NVIDIA and IBM have done around POWER9, including integration of NVIDIA NVLink® and NVIDIA Tesla Tensor Core GPUs. PowerAI is IBM’s software layer that optimizes how today’s data science and AI workloads run on heterogeneous computing systems, and our goal is for this improved performance trajectory for GPU accelerated workloads on POWER9 to continue with RAPIDS.
IBM Watson Studio and Watson Machine Learning, to take advantage of the power of NVIDIA GPUs so that data scientists and AI developers can build, deploy, and run models for their AI applications faster than with CPU-only deployments, in a multicloud environment with IBM Cloud Private for Data and IBM Cloud.
IBM Cloud, where users who choose machines equipped with GPUs will be able to apply the accelerated machine learning and analytics libraries in RAPIDS to their cloud applications and tap the benefits of machine learning.

“IBM and NVIDIA’s close collaboration over the years has helped leading enterprises and organizations around the world tackle some of the world’s largest problems,” said Ian Buck, vice president and general manager of Accelerated Computing at NVIDIA. “Now, with IBM taking advantage of RAPIDS open-source libraries announced today by NVIDIA, GPU accelerated machine learning is coming to data scientists, helping them analyze big data for insights faster than ever possible before.”

Machine learning is a form of AI that enables a system to learn from data rather than through explicit programming. Enterprises across industries such as retail, finance, and telecommunications are either actively using machine learning or exploring the value it offers to companies that want to use big data to better understand subtle changes in customer behavior, preferences, or satisfaction.

Earlier this year, IBM set a record in a tera-scale machine learning benchmark, beating the previous record holder by 46x. Using an IBM Research-developed machine learning algorithm called IBM Snap Machine Learning (Snap ML) running on IBM Power Systems AC922 servers with NVIDIA Tesla V100 Tensor Core GPUs, IBM researchers trained a logistic regression classifier in 91.5 seconds using an online advertising dataset released by Criteo Labs with over 4 billion training examples.
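
For readers unfamiliar with the model type named in the benchmark, the following is a minimal sketch of how a logistic regression classifier learns its parameters from labeled examples. It is written in plain Python with full-batch gradient descent; it is not Snap ML, does not use RAPIDS or a GPU, and does not reflect how the benchmark was run. The function names and the six-row toy dataset are hypothetical and were made up purely for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic_regression(X, y, lr=0.5, epochs=200):
    # Hypothetical helper: full-batch gradient descent on the mean log loss.
    n, n_features = len(X), len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # derivative of the log loss w.r.t. the linear score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    # Classify as 1 when the predicted probability is at least 0.5.
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if sigmoid(score) >= 0.5 else 0

# Toy data (invented for this sketch): one feature, label 1 when it is positive.
X = [[-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0]]
y = [0, 0, 0, 1, 1, 1]

w, b = train_logistic_regression(X, y)
predictions = [predict(w, b, x) for x in X]
print(predictions == y)
```

The classification rule is never written by hand here; the weight and bias are adjusted from the examples alone, which is the sense in which the system learns from data. At the scale of billions of examples, the same kind of computation is what GPU acceleration is meant to speed up.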

Source: NVIDIA