HPC in Life Sciences 2020 Part 2: Setting the Stage for AI’s Opening Act

By John Russell

June 10, 2020

The rise of AI – machine and deep learning – in life sciences has stirred the same excitement and same ‘fits-and-starts’ reality as elsewhere. Today, AI is mostly used in two areas: 1) embedded in life science instruments such as cryo-electron microscopes where it assists in feature recognition and lies largely hidden from users, and 2) in high profile projects such as NCI-DoE’s CANcer Distributed Learning Environment (CANDLE) which has access to supercomputing capacity and plenty of AI expertise. AI is, of course, being used against COVID19 at many large labs. There’s not much in between.

The truth is that AI remains in its infancy elsewhere in life sciences. The appetite for AI use is high but so are the early obstacles – not least the hype which has already ticked off clinicians, but also uneven data quality, non-optimal compute infrastructure, and limited AI expertise. On the plus side, practical pilot programs are now sprouting.

For this second part of the 2020 HPC in Life Sciences report HPCwire talked with Fernanda Foertter, BioTeam’s recent hire and now a senior scientific consultant specializing in AI. She joined BioTeam from Nvidia where she worked on AI in healthcare. Before that, Foertter was at Oak Ridge National Laboratory as a data scientist working on scalable algorithms for data analysis and deep learning. Her comments are blended with those from HPCwire’s earlier interview with Ari Berman, BioTeam CEO, Chris Dagdigian, co-founder, and Mike Steeves, senior scientific consultant.

Foertter sets the stage nicely:

TALE OF 2 DISCIPLINES – AI HUNGER IN RESEARCH NOT MATCHED IN THE CLINIC (YET)

Fernanda Foertter, BioTeam

“Let’s talk a little bit on the historical side. If you look back at 2016, some of the life sciences related AI work that DOE was doing on supercomputers, we have the CANDLE project whose three pilots had the main basic AI applications in life sciences – so NLP (natural language processing) on SEER data; using AI for accelerating molecular dynamics; and drug discovery (drug screening). The first thing that really started to pick up was NLP. I know the NLP pilot is actually working really well today, [and] the molecular dynamics pilot is working really well too. The drug discovery didn’t have the right data to begin with and was delayed while in-vivo was being generated. “

“So those are the initial applications areas and as GPU memories grew larger, medical imaging started to mature within AI. These are the main areas in the life sciences today. where AI has been applied successfully – but still, all of these remain in the research realm. If you ask me, NLP is probably the easiest place that could go from research to production. Searching through text for context, for example giving people at least some idea that yes, something has been published before, or, we have something in this documentation here in our hospital, or some other patients’ file describes something similar, I think that these examples could actually work really well in the clinical setting.”

Also notable is that HPC, often with AI, has been used extensively in COVD-19 research (see HPCwire’s extensive coverage of HPC efforts fighting the pandemic). It seems reasonable to expect those efforts will not only bear fruit against COVID-19 but also generate new approaches for using HPC plus AI in life sciences research.

However, making AI work in the clinic has turned out to be challenging.

“When I went to Nvidia (Feb. 2018) the application that was most promising and fastest changing was on the medical imaging side. Even in the short, two years that I was there a lot changed. We went from talking about “can we discover pneumonia? can we discover tumors?” to talking about being able to grade tumors, which is much more refined. The one application I think everybody wishes would happen really, really quickly, but hasn’t really materialized is digital pathology,” said Foertter.

“The real-life pathology workflow is very difficult. Somebody, hopefully an experienced pathologist, has to go through a picture that has a very high pixel count, ie., very high resolution. They choose a few areas to investigate based on their experience because they cannot study the image in its entirety, and the pathologist will make a diagnosis. The miss rate is still somewhat absurdly high, especially for novice pathologists, anywhere between 30% to 40%. That means you can send somebody’s sample to pathology and depending on who sees it they’re going to miss that you have cancer or some other etiology. The reason [digital pathology] hasn’t really materialized is partly because the images are so large. To do any sort of convolution neural network on a really large image to train it would just break it. We were just doing the debugging of 3D convolutional neural networks for things like MRI reconstruction and digital pathology last year. It’s still, I think, not quite there. Just the memory size is really hard.”

Also problematic was the over-selling of AI’s capability to the clinician community replete with sometimes explicit claims clinicians would become obsolete.

“There has been a lot of animosity from physicians, justifiably so. The whole AI thing was kind of sold as if it could replace a lot of clinicians. And they [AI researchers] quickly corrected that, that kind of hubristic language,” said Foertter, who compares the situation to the initial reaction of airline pilots to fly-by-wire systems. “My brother flies a 737. He loves it because it’s not fly-by-wire. There’s still mechanical systems in there. That trust in fly-by-wire still hasn’t percolated through much of the pilot community. I feel the medical community will need to feel the trust and [be able] to investigate the data they don’t agree with or when they don’t understand the output. AI is still seen as a black box.”

The emerging picture is one of accelerating AI adoption in biomedical research but a slowness in the clinic that seems unlikely to change near-term. The latter seems driven by still nascent AI expertise, computational infrastructure shortcomings, and FDA requirements for underlying hardware and software used in the clinic. Foertter offers two examples:

  • “One of the examples I like to use is Google’s project where they took fundus images – the back of the eye – and trained it to figure out certain things like cardiovascular risk and diabetes. They actually created an app out of the work and deployed it on their cloud. They collaborated with clinicians in Thailand who were enthusiastic, based on the initial results showing accuracy as high as 95% in diagnosis.  But two years later Google realized that the 95% accuracy in development dropped substantially in the field, because number one, the quality of the images that it was trained on was much higher than Thailand had available coming out of their imaging equipment, and number two, the infrastructure wasn’t available for them to upload to the cloud when they had these [high quality image data] available.”
  • The bulk of the [AI] talent and resources are located in Europe and US. These places have very stringent requirements, not in a negative sense but a safety and privacy sense, [for] hardware, software, and data. If you want to put a GPU, for example, in a medical instrument, which is an example that we had at Nvidia, by the time it goes to the FDA approval, the technology that was used to obtain regulatory approval was probably five years old? If you think about putting a 2015 GPU in the [approved] machine, that GPU number one is no longer in production, and number two the methodologies have changed.

Advances in AI technology and underlying methodologies are advancing too quickly for easy absorption into the clinic. Foertter and Berman both credit FDA efforts in trying to overcome some of these problems.

DOE’S TITAN SUPERCOMPUTER PRIMED THE AI PUMP

The reluctance to adopt AI in the clinic is completely reversed in biomedical research where there is something of a stampede building with enthusiasm for AI running rampant. Time will tell how the herd fares. BioTeam reports there’s a general lack of knowledge about basic AI-enabling technology and requirements along with uncertainty about picking projects.

Said Dagdigian, “Currently we are mostly enabling the ML/DL stuff from an IT infrastructure, you know, getting the systems up and running either on premise or in the cloud on helping them configure the storage, which needs to be higher performance. I will say that the people who can afford the DGXs seem to really love them, but other than that most of the people I’m working with are still on [CPU-centric systems]. They’re not picking the product as much they’re picking the toolkit or the environment. People are like, OK, we’re all in on TensorFlow and we’re going to start scaling up and training on that.”

Berman noted, “Some other things have started coming out like using AI frameworks to reduce the possibilities of drug targets [by assessing pocket binding characteristics matches] which is something that is being used massively at the moment in the coronavirus pandemic. That combined with electrodynamic simulations to verify that likelihood [of binding] is high and taking it into the lab to verify has the potential to reduce the time to drug development significantly, not to mention the cost.”

In terms of technology preferences, Berman said, “It really just depends on what the applications are. Most folks are just interested in having a small or large army of Nvidia Volta 100s in their arsenal and they can use several different generations of those just because that stuff gets a little expensive. Academics, who can rest on the non-production licenses, can use things like the Nvidia 1080 Ti or something like that. GPUs in general are still the preferred methodology for training, although there’s some recent innovations in the software stack.”

Foertter’s experience in high-end HPC, notably at ORNL, provides an interesting perspective on accelerator use (GPUs, specialty chips (e.g. Cerebras), FPGAs et al); this is especially so given Intel’s jump back into GPUs and AMD’s recent resurgence.

The Titan supercomputer. Image courtesy of OLCF.

“Why is AI so popular now? Where did it begin? This resurgence began primarily because of HPC leveraging GPUs for scientific purposes. The same accelerated linear algebra libraries that are underneath the frameworks being used in AI were built because of Titan (DOE’s first heterogeneous GPU-centric supercomputer, ~2012). They were built for HPC and some bright folks realized they could leverage these for deep learning and all of a sudden AI explodes in 2014 – 2015. So, we’ve only been five years at building these math libraries and AI frameworks,” said Foertter.

“Now fast forward a year, around  ~2016. Google says, well, we can probably build a special purpose TPU. They’re in the fourth generation of the TPU, and it still hasn’t quite taken off even within Google where there is still a dependence on GPUs. These specialty chips still require an underlying performant math library, and building these is not trivial. You can’t just kind of say, “Well, here’s a chip, it can do matrix multiply.” Okay, how do I integrate that into the framework? What libraries do I call?”

“I think there’s a good possibility that Intel, because Intel has had experience with building these HPC math libraries, can jump ahead even of AMD, even though AMD has also got the backing of DOE to create the same libraries for their GPU. Once those libraries exist for those newcomer GPUs, once AI frameworks have been built in a way that they can call any vendor’s version of BLAS for example, it’ll be fairly simple to run on emerging accelerators.

“I’m not sure that the special purpose stuff is really going to take off unless there’s somebody that really needs large models, very special models, you know, like the 3D convolutional nets; if a problem requires hardware that isn’t available from commodity options, it may be worth it for somebody to build it, sort of an FPGA style. But I don’t see those taking off and I don’t see a Google TPU taking off really, it’s too soon to tell,” said Foertter.

Not surprisingly the range of technology options available varies widely. DoE and NSF have big systems now and have been aggressively adding more (see NSF/PSC’s recent big bet on a Cerebras-based system). Big Pharma and big biotechs often have sophisticated systems, but mostly traditional CPU-centric architectures. Smaller biotechs and much of academia often have more limited computational resources. As a rule these days, a basic AI server has a couple of CPUs and 2-to-8 GPUs and virtually all major vendors have an AI line-up but even these are not cheap.

“The ability to put things on the cloud and burst to the cloud is a great place to try the new technologies and not have to worry about calling IT and racking it. I mean, I don’t know if you know about what it takes to rack up a DGX-2, but it takes a special team. It’s a big piece of equipment. You have to pay a team and they have to be certified to install that. It’s not something that you can kind of get off the shelf,” said Foertter. The cloud makes sense [when] there’s not a lot of IO, which is what usually costs a lot. If you’re in that scenario where there’s not a lot of IO, or you don’t have any other kind of legal constraints, where data can’t leave, or can’t stay, or HIPAA, the cloud is good place to get started.”

SHOULD AI’S HEAD BE IN THE CLOUDS?

 The big cloud providers have all been adding accelerators of one or another ilk and adding (or developing) AI frameworks and integration tools. Berman notes Amazon’s SageMaker (ML development and deployment) is well integrated with other AWS services. Google, he said, also has powerful capabilities though they are somewhat more complicated to use.

The STRIDES Initiative from NIH, begun in July 2018 with Google and Amazon, is intended to help provide discounted access to cloud-services and various data repositories. Its benefits, as described by NIH include:

  • Discounts on STRIDES¹ Initiative partner services – Favorable pricing on computing, storage, and related cloud services for NIH Institutes, Centers, and Offices (ICOs) and NIH-funded institutions.
  • Professional services – Access to professional service consultations and technical support from the STRIDES Initiative partners.
  • Training – Access to training for researchers, data owners, and others to help ensure optimal use of available tools and technologies.
  • Potential collaborative engagements – Opportunities to explore methods and approaches that may advance NIH’s biomedical research objectives (with scope and milestones of engagements agreed upon separately).
Ari Berman, BioTeam

Said Berman, “STRIDES has generated broad interest for good reasons. There have been successful pilots with large datasets and large data platforms that NIH curates utilizing both GCP and AWS, and those have been really big wins. You know, STRIDES has been an integral part of the COVID-19 response that is going on right now and will continue to be. So I’d say as an initial release, especially for sophisticated groups, it’s been very positive. I think what’s coming down [in the future] is how to make it more accessible to people who don’t know how to operate clouds, and to implement platforms and things like that.”

Access to AI technology isn’t really the problem suggests BioTeam. It’s access to curated and integrated data that is a stumbling block. That and picking as starting (pilot) project. This is true even in big companies.

“Pharma companies generate tons of data. Some have been in the business for 40 years, like Johnson and Johnson or GSK. Along the way they’ve purchased 10, 15, 20 different Pharmas, and you’d be surprised at how many of those different acquired databases have never been integrated and the data are not harmonized. It’s amazing data, but it’s never been brought together. How do you use that? You need somebody that understands all these databases.”

Berman agreed. “As Fernanda laid out, data are so siloed. There are standards; there’s just 37 of them. Right? And they don’t talk to each other. There are entire consortia trying to figure this out. They’re trying to reduce the 37 to like eight, which still is too many. What I think is actually going to happen is that some of the existing standards are going to be come out stronger than others. Those standards are going to be used to create individual platforms that don’t talk to each other that are still siloed, so it has the effect of coalescing some of the data, but you still can’t mix the platforms. That’s what I think will happen. That also will enable a greater degree of AI and analytics to come in, but it’s still not going to realize the ability to query across all the data.”

Given the still fragmented and nascent world of AI, just getting started is currently the biggest challenge for many in commercial and academic life science research. Clinicians are a step or two behind them. While that sounds a bit formulaic and consultant-speak(ish), it also has the subdued ring of current reality. AI is still poorly defined. Machine learning and deep learning are concrete. But tools (hardware and s/w) for making practical use of ML and DL are emerging and now is likely the time to test the waters.

Said Foertter, “Realistically, they should be trying to do something. A lot of folks just don’t know where to start. They know that AI can bring them some value. They know they would like to explore to see if it could bring them value. But a lot of folks say they are just kind of waiting for somebody to sell them something. Here’s something you can put your data in and outcomes magic. That’s not what where are right now. That’s still probably another five years away to get to the point where people are going to be selling these sorts of ready-to-go, train-your-own solutions.”

BioTeam’s advice is to start small – a small pilot for which you have the data available and you have people that understand those data. Start with a small question, using a cloud resource or if you have adequate IT internal resources, and see where it goes.

Link to part one (HPC in Life Sciences 2020: the Rise of AMD, Taming Data Management’s Wild West), https://www.hpcwire.com/2020/05/20/hpc-in-life-sciences-2020-part-1-rise-of-amd-data-managements-wild-west-more/

Footnote 1 STRIDES is the NIH Science and Technology Research Infrastructure for Discovery, Experimentation, and Sustainability,  https://datascience.nih.gov/strides

Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industy updates delivered to you every week!

Congressional Hearing on U.S. National Quantum Initiative Reauthorization Set for this Week

June 5, 2023

On Wednesday of this week the House Science Committee will hold a hearing as part of the reauthorization effort for the U.S. National Quantum Initiative Act passed in 2018. In recent years, the global race to achieve qua Read more…

Researchers Develop Integrated Photonic Platform Based on Thin-Film Lithium Niobate

June 3, 2023

Researchers are leveraging photonics to develop and scale the hardware necessary to tackle the stringent requirements of quantum information technologies. By exploiting the properties of photonics, researchers point to t Read more…

ASC23: Application Results

June 2, 2023

The ASC23 organizers put together a slate of fiendishly difficult applications for the students this year. The apps were a mix of traditional HPC packages, like WRF-Hydro and FVCOM, plus machine learning centric programs Read more…

Q&A with Marco Pistoia, an HPCwire Person to Watch in 2023

June 2, 2023

HPCwire Person to Watch Marco Pistoia wears a lot of hats at JPMorgan Chase & Co.: managing director, distinguished engineer, head of global technology applied research and head of quantum computing. That work with J Read more…

HPC Career Notes: June 2023 Edition

June 1, 2023

In this monthly feature, we’ll keep you up-to-date on the latest career developments for individuals in the high-performance computing community. Whether it’s a promotion, new company hire, or even an accolade, we’ Read more…

AWS Solution Channel

Shutterstock 1493175377

Introducing GPU health checks in AWS ParallelCluster 3.6

GPU failures are relatively rare but when they do occur, they can have severe consequences for HPC and deep learning tasks. For example, they can disrupt long-running simulations and distributed training jobs. Read more…

 

Shutterstock 1415788655

New Thoughts on Leveraging Cloud for Advanced AI

Artificial intelligence (AI) is becoming critical to many operations within companies. As the use and sophistication of AI grow, there is a new focus on the infrastructure requirements to produce results fast and efficiently. Read more…

Intersect360: HPC Market ‘Returning to Stable Growth’

June 1, 2023

The folks at Intersect360 Research released their latest report and market update just ahead of ISC 2023, which was held in Hamburg, Germany, last week. The headline: “We’re returning to stable growth,” per Addison Read more…

Intersect360: HPC Market ‘Returning to Stable Growth’

June 1, 2023

The folks at Intersect360 Research released their latest report and market update just ahead of ISC 2023, which was held in Hamburg, Germany, last week. The hea Read more…

Lori Diachin to Lead the Exascale Computing Project as It Nears Final Milestones

May 31, 2023

The end goal is in sight for the multi-institutional Exascale Computing Project (ECP), which launched in 2016 with a mandate from the Department of Energy (DOE) Read more…

At ISC, Sustainable Computing Leaders Discuss HPC’s Energy Crossroads

May 30, 2023

In the wake of SC22 last year, HPCwire wrote that “the conference’s eyes had shifted to carbon emissions and energy intensity” rather than the historical Read more…

Nvidia Announces Four Supercomputers, with Two in Taiwan

May 29, 2023

At the Computex event in Taipei this week, Nvidia announced four new systems equipped with its Grace- and Hopper-generation hardware, including two in Taiwan. T Read more…

Nvidia to Offer a ‘1 Exaflops’ AI Supercomputer with 256 Grace Hopper Superchips

May 28, 2023

We in HPC sometimes roll our eyes at the term “AI supercomputer,” but a new system from Nvidia might live up to the moniker: the DGX GH200 AI supercomputer. Read more…

Closing ISC Keynote by Sterling and Suarez Looks Backward and Forward

May 25, 2023

ISC’s closing keynote this year was given jointly by a pair of distinguished HPC leaders, Thomas Sterling of Indiana University and Estela Suarez of Jülich S Read more…

The Grand Challenge of Simulating Nuclear Fusion: An Overview with UKAEA’s Rob Akers

May 25, 2023

As HPC and AI continue to rapidly advance, the alluring vision of nuclear fusion and its endless zero-carbon, low-radioactivity energy is the sparkle in many a Read more…

MareNostrum 5 Hits Speed Bumps; Iconic Chapel to Host Quantum Systems

May 23, 2023

MareNostrum 5, the next-generation supercomputer at the Barcelona Supercomputing Center (BSC) and one of EuroHPC’s flagship pre-exascale systems, has had a di Read more…

CORNELL I-WAY DEMONSTRATION PITS PARASITE AGAINST VICTIM

October 6, 1995

Ithaca, NY --Visitors to this year's Supercomputing '95 (SC'95) conference will witness a life-and-death struggle between parasite and victim, using virtual Read more…

SGI POWERS VIRTUAL OPERATING ROOM USED IN SURGEON TRAINING

October 6, 1995

Surgery simulations to date have largely been created through the development of dedicated applications requiring considerable programming and computer graphi Read more…

U.S. Will Relax Export Restrictions on Supercomputers

October 6, 1995

New York, NY -- U.S. President Bill Clinton has announced that he will definitely relax restrictions on exports of high-performance computers, giving a boost Read more…

Dutch HPC Center Will Have 20 GFlop, 76-Node SP2 Online by 1996

October 6, 1995

Amsterdam, the Netherlands -- SARA, (Stichting Academisch Rekencentrum Amsterdam), Academic Computing Services of Amsterdam recently announced that it has pur Read more…

Cray Delivers J916 Compact Supercomputer to Solvay Chemical

October 6, 1995

Eagan, Minn. -- Cray Research Inc. has delivered a Cray J916 low-cost compact supercomputer and Cray's UniChem client/server computational chemistry software Read more…

NEC Laboratory Reviews First Year of Cooperative Projects

October 6, 1995

Sankt Augustin, Germany -- NEC C&C (Computers and Communication) Research Laboratory at the GMD Technopark has wrapped up its first year of operation. Read more…

Sun and Sybase Say SQL Server 11 Benchmarks at 4544.60 tpmC

October 6, 1995

Mountain View, Calif. -- Sun Microsystems, Inc. and Sybase, Inc. recently announced the first benchmark results for SQL Server 11. The result represents a n Read more…

New Study Says Parallel Processing Market Will Reach $14B in 1999

October 6, 1995

Mountain View, Calif. -- A study by the Palo Alto Management Group (PAMG) indicates the market for parallel processing systems will increase at more than 4 Read more…

Leading Solution Providers

Contributors

CORNELL I-WAY DEMONSTRATION PITS PARASITE AGAINST VICTIM

October 6, 1995

Ithaca, NY --Visitors to this year's Supercomputing '95 (SC'95) conference will witness a life-and-death struggle between parasite and victim, using virtual Read more…

SGI POWERS VIRTUAL OPERATING ROOM USED IN SURGEON TRAINING

October 6, 1995

Surgery simulations to date have largely been created through the development of dedicated applications requiring considerable programming and computer graphi Read more…

U.S. Will Relax Export Restrictions on Supercomputers

October 6, 1995

New York, NY -- U.S. President Bill Clinton has announced that he will definitely relax restrictions on exports of high-performance computers, giving a boost Read more…

Dutch HPC Center Will Have 20 GFlop, 76-Node SP2 Online by 1996

October 6, 1995

Amsterdam, the Netherlands -- SARA, (Stichting Academisch Rekencentrum Amsterdam), Academic Computing Services of Amsterdam recently announced that it has pur Read more…

Cray Delivers J916 Compact Supercomputer to Solvay Chemical

October 6, 1995

Eagan, Minn. -- Cray Research Inc. has delivered a Cray J916 low-cost compact supercomputer and Cray's UniChem client/server computational chemistry software Read more…

NEC Laboratory Reviews First Year of Cooperative Projects

October 6, 1995

Sankt Augustin, Germany -- NEC C&C (Computers and Communication) Research Laboratory at the GMD Technopark has wrapped up its first year of operation. Read more…

Sun and Sybase Say SQL Server 11 Benchmarks at 4544.60 tpmC

October 6, 1995

Mountain View, Calif. -- Sun Microsystems, Inc. and Sybase, Inc. recently announced the first benchmark results for SQL Server 11. The result represents a n Read more…

New Study Says Parallel Processing Market Will Reach $14B in 1999

October 6, 1995

Mountain View, Calif. -- A study by the Palo Alto Management Group (PAMG) indicates the market for parallel processing systems will increase at more than 4 Read more…

ISC 2023 Booth Videos

Cornelis Networks @ ISC23
Dell Technologies @ ISC23
Intel @ ISC23
Lenovo @ ISC23
Microsoft @ ISC23
ISC23 Playlist
  • arrow
  • Click Here for More Headlines
  • arrow
HPCwire