HPC in Life Sciences 2020 Part 2: Setting the Stage for AI’s Opening Act

By John Russell

June 10, 2020

The rise of AI – machine and deep learning – in life sciences has stirred the same excitement and the same ‘fits-and-starts’ reality as elsewhere. Today, AI is mostly used in two areas: 1) embedded in life science instruments such as cryo-electron microscopes, where it assists in feature recognition and lies largely hidden from users, and 2) in high-profile projects such as the NCI-DOE CANcer Distributed Learning Environment (CANDLE), which has access to supercomputing capacity and plenty of AI expertise. AI is, of course, also being used against COVID-19 at many large labs. There’s not much in between.

The truth is that AI remains in its infancy elsewhere in life sciences. The appetite for AI is high, but so are the early obstacles – not least the hype, which has already ticked off clinicians, but also uneven data quality, non-optimal compute infrastructure, and limited AI expertise. On the plus side, practical pilot programs are now sprouting.

For this second part of the 2020 HPC in Life Sciences report HPCwire talked with Fernanda Foertter, BioTeam’s recent hire and now a senior scientific consultant specializing in AI. She joined BioTeam from Nvidia where she worked on AI in healthcare. Before that, Foertter was at Oak Ridge National Laboratory as a data scientist working on scalable algorithms for data analysis and deep learning. Her comments are blended with those from HPCwire’s earlier interview with Ari Berman, BioTeam CEO, Chris Dagdigian, co-founder, and Mike Steeves, senior scientific consultant.

Foertter sets the stage nicely:

TALE OF 2 DISCIPLINES – AI HUNGER IN RESEARCH NOT MATCHED IN THE CLINIC (YET)

Fernanda Foertter, BioTeam

“Let’s talk a little bit on the historical side. If you look back at 2016, at some of the life sciences-related AI work that DOE was doing on supercomputers, we have the CANDLE project, whose three pilots had the main basic AI applications in life sciences – so NLP (natural language processing) on SEER data; using AI for accelerating molecular dynamics; and drug discovery (drug screening). The first thing that really started to pick up was NLP. I know the NLP pilot is actually working really well today, [and] the molecular dynamics pilot is working really well too. The drug discovery [pilot] didn’t have the right data to begin with and was delayed while in-vivo [data] was being generated.”

“So those are the initial application areas, and as GPU memories grew larger, medical imaging started to mature within AI. These are the main areas in the life sciences today where AI has been applied successfully – but still, all of these remain in the research realm. If you ask me, NLP is probably the easiest place that could go from research to production. Searching through text for context – for example, giving people at least some idea that yes, something has been published before, or we have something in this documentation here in our hospital, or some other patient’s file describes something similar – I think that these examples could actually work really well in the clinical setting.”
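Foertter’s clinical-search example is straightforward to prototype. The sketch below is only a minimal illustration of searching a document set for similar notes, using TF-IDF vectors and cosine similarity from scikit-learn; the notes and the query are made-up placeholders, not anything drawn from CANDLE or a real hospital system.

```python
# Minimal sketch: find previously written clinical notes similar to a query.
# The notes and query below are invented placeholders, not real patient data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

notes = [
    "Patient presented with persistent cough and low-grade fever; chest imaging ordered.",
    "Follow-up after biopsy; pathology reports benign lesion, no further action.",
    "MRI shows lesion consistent with prior case series; oncology referral made.",
]

query = "new patient with a lesion similar to earlier documented cases"

vectorizer = TfidfVectorizer(stop_words="english")
note_vectors = vectorizer.fit_transform(notes)   # one row per existing note
query_vector = vectorizer.transform([query])

scores = cosine_similarity(query_vector, note_vectors)[0]
for score, note in sorted(zip(scores, notes), reverse=True):
    print(f"{score:.2f}  {note}")
```

Production clinical search would of course involve far richer models and privacy controls; the point here is only that the “has something like this been seen before?” use case maps onto well-worn text-retrieval machinery.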

Also notable is that HPC, often with AI, has been used extensively in COVID-19 research (see HPCwire’s extensive coverage of HPC efforts fighting the pandemic). It seems reasonable to expect those efforts will not only bear fruit against COVID-19 but also generate new approaches for using HPC plus AI in life sciences research.

However, making AI work in the clinic has turned out to be challenging.

“When I went to Nvidia (Feb. 2018) the application that was most promising and fastest changing was on the medical imaging side. Even in the short two years that I was there a lot changed. We went from talking about ‘can we discover pneumonia? can we discover tumors?’ to talking about being able to grade tumors, which is much more refined. The one application I think everybody wishes would happen really, really quickly, but hasn’t really materialized, is digital pathology,” said Foertter.

“The real-life pathology workflow is very difficult. Somebody, hopefully an experienced pathologist, has to go through a picture that has a very high pixel count, i.e., very high resolution. They choose a few areas to investigate based on their experience, because they cannot study the image in its entirety, and the pathologist will make a diagnosis. The miss rate is still somewhat absurdly high, especially for novice pathologists – anywhere between 30% to 40%. That means you can send somebody’s sample to pathology and, depending on who sees it, they’re going to miss that you have cancer or some other etiology. The reason [digital pathology] hasn’t really materialized is partly because the images are so large. To do any sort of convolutional neural network on a really large image to train it would just break it. We were just doing the debugging of 3D convolutional neural networks for things like MRI reconstruction and digital pathology last year. It’s still, I think, not quite there. Just the memory size is really hard.”
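The memory constraint Foertter describes is commonly worked around by tiling: rather than convolving over a gigapixel slide in one pass, the slide is cut into patches that fit in GPU memory and the network trains patch by patch. The sketch below is only an illustration of that general idea, using a synthetic array and a toy Keras model; it is not the pipeline Nvidia or BioTeam uses, and the labels are fake.

```python
# Illustrative sketch: tile an oversized "slide" into patches small enough for
# a CNN, then train a toy classifier on the patches. Array and labels are
# synthetic stand-ins for real whole-slide imagery.
import numpy as np
import tensorflow as tf

PATCH = 256  # patch edge length in pixels

def tile(slide, patch=PATCH):
    """Cut an (H, W, 3) image into non-overlapping (patch, patch, 3) tiles."""
    h, w, _ = slide.shape
    tiles = []
    for y in range(0, h - patch + 1, patch):
        for x in range(0, w - patch + 1, patch):
            tiles.append(slide[y:y + patch, x:x + patch, :])
    return np.stack(tiles)

# A 4096 x 4096 px "slide"; real digital-pathology slides are far larger.
slide = np.random.rand(4096, 4096, 3).astype("float32")
patches = tile(slide)                            # shape (256, 256, 256, 3)
labels = np.random.randint(0, 2, len(patches))   # fake tumor / no-tumor labels

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, activation="relu", input_shape=(PATCH, PATCH, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(patches, labels, batch_size=32, epochs=1)
```

The trade-off is that patch-level predictions still have to be aggregated back into a slide-level call, which is part of why digital pathology has been slower to materialize than single-image radiology tasks.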

Also problematic was the over-selling of AI’s capability to the clinician community, replete with sometimes explicit claims that clinicians would become obsolete.

“There has been a lot of animosity from physicians, justifiably so. The whole AI thing was kind of sold as if it could replace a lot of clinicians. And they [AI researchers] quickly corrected that, that kind of hubristic language,” said Foertter, who compares the situation to the initial reaction of airline pilots to fly-by-wire systems. “My brother flies a 737. He loves it because it’s not fly-by-wire. There’s still mechanical systems in there. That trust in fly-by-wire still hasn’t percolated through much of the pilot community. I feel the medical community will need to feel the trust and [be able] to investigate the data they don’t agree with or when they don’t understand the output. AI is still seen as a black box.”

The emerging picture is one of accelerating AI adoption in biomedical research but a slowness in the clinic that seems unlikely to change near-term. The latter seems driven by still nascent AI expertise, computational infrastructure shortcomings, and FDA requirements for underlying hardware and software used in the clinic. Foertter offers two examples:

  • “One of the examples I like to use is Google’s project where they took fundus images – the back of the eye – and trained it to figure out certain things like cardiovascular risk and diabetes. They actually created an app out of the work and deployed it on their cloud. They collaborated with clinicians in Thailand who were enthusiastic, based on the initial results showing accuracy as high as 95% in diagnosis.  But two years later Google realized that the 95% accuracy in development dropped substantially in the field, because number one, the quality of the images that it was trained on was much higher than Thailand had available coming out of their imaging equipment, and number two, the infrastructure wasn’t available for them to upload to the cloud when they had these [high quality image data] available.”
  • “The bulk of the [AI] talent and resources are located in Europe and the US. These places have very stringent requirements, not in a negative sense but in a safety and privacy sense, [for] hardware, software, and data. If you want to put a GPU, for example, in a medical instrument, which is an example that we had at Nvidia, by the time it gets through FDA approval, the technology that was used to obtain regulatory approval is probably five years old. If you think about putting a 2015 GPU in the [approved] machine, that GPU number one is no longer in production, and number two the methodologies have changed.”

AI technology and underlying methodologies are advancing too quickly for easy absorption into the clinic. Foertter and Berman both credit the FDA for its efforts to overcome some of these problems.

DOE’S TITAN SUPERCOMPUTER PRIMED THE AI PUMP

The reluctance to adopt AI in the clinic is completely reversed in biomedical research, where there is something of a stampede building, with enthusiasm for AI running rampant. Time will tell how the herd fares. BioTeam reports there’s a general lack of knowledge about basic AI-enabling technology and requirements, along with uncertainty about picking projects.

Said Dagdigian, “Currently we are mostly enabling the ML/DL stuff from an IT infrastructure [perspective], you know, getting the systems up and running either on premise or in the cloud, [and] helping them configure the storage, which needs to be higher performance. I will say that the people who can afford the DGXs seem to really love them, but other than that most of the people I’m working with are still on [CPU-centric systems]. They’re not picking the product as much as they’re picking the toolkit or the environment. People are like, OK, we’re all in on TensorFlow and we’re going to start scaling up and training on that.”
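For teams that go “all in on TensorFlow” and then scale up, the framework’s built-in distribution strategies are a common starting point. What follows is a minimal sketch of single-node, multi-GPU training with tf.distribute.MirroredStrategy; the model and data are placeholders, not anything from a BioTeam engagement.

```python
# Minimal sketch of single-node, multi-GPU training in TensorFlow.
# MirroredStrategy replicates the model across local GPUs (falling back to
# CPU if none are present) and averages gradients across replicas.
import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu", input_shape=(20,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

# Placeholder dataset; in practice this would stream from high-performance storage.
x = np.random.rand(10_000, 20).astype("float32")
y = np.random.rand(10_000, 1).astype("float32")

model.fit(x, y, batch_size=256, epochs=2)
```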

Berman noted, “Some other things have started coming out like using AI frameworks to reduce the possibilities of drug targets [by assessing pocket binding characteristics matches] which is something that is being used massively at the moment in the coronavirus pandemic. That combined with electrodynamic simulations to verify that likelihood [of binding] is high and taking it into the lab to verify has the potential to reduce the time to drug development significantly, not to mention the cost.”
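Berman’s point about using AI frameworks to narrow the field of drug candidates before simulation and lab work can be illustrated with a generic ligand-based screen: featurize molecules as fingerprints and train a classifier to rank likely binders. The sketch below uses RDKit and scikit-learn with made-up SMILES strings and activity labels; it is not the pipeline used in the coronavirus work, only the general shape of the approach.

```python
# Generic sketch of ML-assisted virtual screening: Morgan fingerprints from
# RDKit plus a random-forest classifier. SMILES strings and labels here are
# arbitrary placeholders, not real screening data.
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier

def fingerprint(smiles, n_bits=2048):
    """Turn a SMILES string into a fixed-length Morgan fingerprint vector."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=n_bits)
    arr = np.zeros((n_bits,), dtype=np.int8)
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr

# Tiny placeholder "training set": molecule, known active (1) or inactive (0).
train = [("CCO", 0), ("c1ccccc1O", 1), ("CC(=O)Oc1ccccc1C(=O)O", 1), ("CCN", 0)]
X = np.array([fingerprint(s) for s, _ in train])
y = np.array([label for _, label in train])

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Rank new candidates by predicted probability of activity; only the top
# scorers would move on to expensive simulation and wet-lab verification.
candidates = ["CCOC(=O)c1ccccc1", "CCCCO"]
scores = clf.predict_proba(np.array([fingerprint(s) for s in candidates]))[:, 1]
for smiles, score in sorted(zip(candidates, scores), key=lambda t: -t[1]):
    print(f"{score:.2f}  {smiles}")
```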

In terms of technology preferences, Berman said, “It really just depends on what the applications are. Most folks are just interested in having a small or large army of Nvidia Volta V100s in their arsenal, and they can use several different generations of those just because that stuff gets a little expensive. Academics, who can rely on the non-production licenses, can use things like the Nvidia 1080 Ti or something like that. GPUs in general are still the preferred methodology for training, although there are some recent innovations in the software stack.”

Foertter’s experience in high-end HPC, notably at ORNL, provides an interesting perspective on accelerator use (GPUs, specialty chips such as Cerebras, FPGAs, et al.), especially given Intel’s jump back into GPUs and AMD’s recent resurgence.

The Titan supercomputer. Image courtesy of OLCF.

“Why is AI so popular now? Where did it begin? This resurgence began primarily because of HPC leveraging GPUs for scientific purposes. The same accelerated linear algebra libraries that are underneath the frameworks being used in AI were built because of Titan (DOE’s first heterogeneous GPU-centric supercomputer, ~2012). They were built for HPC and some bright folks realized they could leverage these for deep learning and all of a sudden AI explodes in 2014 – 2015. So, we’ve only been five years at building these math libraries and AI frameworks,” said Foertter.

“Now fast forward a year, to around 2016. Google says, well, we can probably build a special-purpose TPU. They’re in the fourth generation of the TPU, and it still hasn’t quite taken off even within Google, where there is still a dependence on GPUs. These specialty chips still require an underlying performant math library, and building these is not trivial. You can’t just kind of say, ‘Well, here’s a chip, it can do matrix multiply.’ Okay, how do I integrate that into the framework? What libraries do I call?”

“I think there’s a good possibility that Intel, because Intel has had experience with building these HPC math libraries, can jump ahead even of AMD, even though AMD has also got the backing of DOE to create the same libraries for their GPU. Once those libraries exist for those newcomer GPUs, once AI frameworks have been built in a way that they can call any vendor’s version of BLAS for example, it’ll be fairly simple to run on emerging accelerators.

“I’m not sure that the special purpose stuff is really going to take off unless there’s somebody that really needs large models, very special models, you know, like the 3D convolutional nets; if a problem requires hardware that isn’t available from commodity options, it may be worth it for somebody to build it, sort of an FPGA style. But I don’t see those taking off and I don’t see a Google TPU taking off really, it’s too soon to tell,” said Foertter.
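Foertter’s remark about frameworks being able to call “any vendor’s version of BLAS” is visible at the user level: the same framework-level matrix multiply runs unchanged whether it is dispatched to cuBLAS on an Nvidia GPU or to a CPU BLAS path, and in principle to whatever backend a newcomer accelerator ships. Below is a minimal TensorFlow illustration, assuming only whatever devices happen to be present on the machine.

```python
# The same matmul expressed at the framework level; TensorFlow dispatches it
# to cuBLAS on an Nvidia GPU, or to its CPU math paths when no GPU is present.
# The user code does not change per vendor.
import tensorflow as tf

print("Visible devices:", tf.config.list_physical_devices())

a = tf.random.normal((2048, 2048))
b = tf.random.normal((2048, 2048))

# Explicit placement is optional; by default TensorFlow picks an available
# accelerator for the op if one exists.
with tf.device("/GPU:0" if tf.config.list_physical_devices("GPU") else "/CPU:0"):
    c = tf.linalg.matmul(a, b)

print("Result computed on:", c.device)
```

The hard part, as Foertter notes, is not this top layer but the vendor-specific math libraries underneath it; until those exist and the frameworks can target them, the portability is only theoretical.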

Not surprisingly, the range of technology options available varies widely. DOE and NSF have big systems now and have been aggressively adding more (see NSF/PSC’s recent big bet on a Cerebras-based system). Big pharma and big biotechs often have sophisticated systems, but mostly traditional CPU-centric architectures. Smaller biotechs and much of academia often have more limited computational resources. As a rule these days, a basic AI server has a couple of CPUs and two-to-eight GPUs, and virtually all major vendors have an AI line-up, but even these are not cheap.

“The ability to put things on the cloud and burst to the cloud is a great place to try the new technologies and not have to worry about calling IT and racking it. I mean, I don’t know if you know about what it takes to rack up a DGX-2, but it takes a special team. It’s a big piece of equipment. You have to pay a team and they have to be certified to install that. It’s not something that you can kind of get off the shelf,” said Foertter. “The cloud makes sense [when] there’s not a lot of IO, which is what usually costs a lot. If you’re in that scenario where there’s not a lot of IO, or you don’t have any other kind of legal constraints, where data can’t leave, or can’t stay, or HIPAA, the cloud is a good place to get started.”

SHOULD AI’S HEAD BE IN THE CLOUDS?

The big cloud providers have all been adding accelerators of one ilk or another and adding (or developing) AI frameworks and integration tools. Berman notes Amazon’s SageMaker (ML development and deployment) is well integrated with other AWS services. Google, he said, also has powerful capabilities, though they are somewhat more complicated to use.
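Berman’s comment about SageMaker’s integration is reflected in how little glue code a managed training job requires. The snippet below is a hedged illustration using the SageMaker Python SDK; the entry script, S3 paths, IAM role, instance type, and version strings are placeholders, and the exact supported framework versions depend on what AWS currently offers.

```python
# Illustrative sketch of launching a managed TensorFlow training job with the
# SageMaker Python SDK. Entry script, S3 bucket, IAM role, and instance type
# are placeholders; framework/py versions depend on currently supported images.
import sagemaker
from sagemaker.tensorflow import TensorFlow

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder ARN

estimator = TensorFlow(
    entry_point="train.py",          # hypothetical training script
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",   # single-V100 GPU instance
    framework_version="2.1",
    py_version="py3",
    sagemaker_session=session,
)

# Data stays in S3; SageMaker provisions the instance, runs the script,
# and tears the infrastructure down when training finishes.
estimator.fit({"training": "s3://example-bucket/path/to/training-data"})
```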

The STRIDES Initiative from NIH, begun in July 2018 with Google and Amazon, is intended to help provide discounted access to cloud services and various data repositories. Its benefits, as described by NIH, include:

  • Discounts on STRIDES¹ Initiative partner services – Favorable pricing on computing, storage, and related cloud services for NIH Institutes, Centers, and Offices (ICOs) and NIH-funded institutions.
  • Professional services – Access to professional service consultations and technical support from the STRIDES Initiative partners.
  • Training – Access to training for researchers, data owners, and others to help ensure optimal use of available tools and technologies.
  • Potential collaborative engagements – Opportunities to explore methods and approaches that may advance NIH’s biomedical research objectives (with scope and milestones of engagements agreed upon separately).
Ari Berman, BioTeam

Said Berman, “STRIDES has generated broad interest for good reasons. There have been successful pilots with large datasets and large data platforms that NIH curates utilizing both GCP and AWS, and those have been really big wins. You know, STRIDES has been an integral part of the COVID-19 response that is going on right now and will continue to be. So I’d say as an initial release, especially for sophisticated groups, it’s been very positive. I think what’s coming down [in the future] is how to make it more accessible to people who don’t know how to operate clouds, and to implement platforms and things like that.”

Access to AI technology isn’t really the problem, suggests BioTeam. The stumbling blocks are access to curated, integrated data and picking a starting (pilot) project. This is true even in big companies.

“Pharma companies generate tons of data. Some have been in the business for 40 years, like Johnson and Johnson or GSK. Along the way they’ve purchased 10, 15, 20 different pharmas, and you’d be surprised at how many of those different acquired databases have never been integrated and the data are not harmonized. It’s amazing data, but it’s never been brought together. How do you use that? You need somebody that understands all these databases,” said Foertter.

Berman agreed. “As Fernanda laid out, data are so siloed. There are standards; there’s just 37 of them. Right? And they don’t talk to each other. There are entire consortia trying to figure this out. They’re trying to reduce the 37 to like eight, which still is too many. What I think is actually going to happen is that some of the existing standards are going to come out stronger than others. Those standards are going to be used to create individual platforms that don’t talk to each other, that are still siloed, so it has the effect of coalescing some of the data, but you still can’t mix the platforms. That’s what I think will happen. That also will enable a greater degree of AI and analytics to come in, but it’s still not going to realize the ability to query across all the data.”

Given the still fragmented and nascent world of AI, just getting started is currently the biggest challenge for many in commercial and academic life science research. Clinicians are a step or two behind them. While that sounds a bit formulaic and consultant-speak(ish), it also has the subdued ring of current reality. AI is still poorly defined; machine learning and deep learning are concrete. But tools (hardware and software) for making practical use of ML and DL are emerging, and now is likely the time to test the waters.

Said Foertter, “Realistically, they should be trying to do something. A lot of folks just don’t know where to start. They know that AI can bring them some value. They know they would like to explore to see if it could bring them value. But a lot of folks say they are just kind of waiting for somebody to sell them something – here’s something you can put your data in and out comes magic. That’s not where we are right now. We’re still probably another five years away from the point where people are going to be selling these sorts of ready-to-go, train-your-own solutions.”

BioTeam’s advice is to start small: a pilot for which you have the data available and people who understand those data. Start with a small question, using a cloud resource or adequate internal IT resources if you have them, and see where it goes.

Link to part one (HPC in Life Sciences 2020: the Rise of AMD, Taming Data Management’s Wild West), https://www.hpcwire.com/2020/05/20/hpc-in-life-sciences-2020-part-1-rise-of-amd-data-managements-wild-west-more/

Footnote 1: STRIDES is the NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability, https://datascience.nih.gov/strides
