ISC Keynote: The Algorithms of Life – Scientific Computing for Systems Biology

By John Russell

June 19, 2019

Systems biology has existed loosely under many definitions for a couple of decades. It’s the notion of describing living systems using first-principles physics and mathematics to capture life in equations that are both descriptive and predictive – and, let’s add, productive, by which we mean able to deliver therapies (drugs et al.) to enhance health and fight disease.

Doing that has proven difficult at best and disappointing at worst, as even a cursory glance at the state of healthcare reveals – notwithstanding many marvelous breakthroughs, such as the sequencing of the human genome and the steady chipping away at functional genomics (and other ’omics) to better understand how DNA informs what we become.

Ivo Sbalzarini

With apologies to ISC organizers, I’ve stolen the name of the opening keynote by Ivo Sbalzarini – The Algorithms of Life: Scientific Computing for Systems Biology – for the headline of this article in an attempt to capture his expansive presentation. Thanks also to Sbalzarini for providing a few of his slides.

Given all we know today and the steady gush of experimental data from modern instruments, what we are missing, said Sbalzarini, are the algorithms to make sense of it all. Having poked away at this problem for nearly as long as it has existed, Sbalzarini presented a sweeping approach to digging out those algorithms, one that capitalizes on recent advances in imaging technology, immersive virtual/augmented reality, a sophisticated analysis approach that leverages particle-mesh mathematics (built into the OpenFPM software platform), and, no surprise, the steadily growing power of HPC.


As in many important life-science advances, the ‘lowly’ fruit fly took center stage. In this instance the analysis investigated a dysregulation in embryogenesis – specifically, the failure of tissue to fold properly. In the end, the researchers identified the influence of the DNA, of the chemical environment, and of the mechanical environment, and delivered a predictive understanding of the embryo’s tissue response. Lest you think this is old work, it was presented last week at the New York Scientific Data Summit.

Getting from Sbalzarini’s nascent research 15 years ago to the impressive results (and tool suite) presented was a long journey. We’ll summarize it as best as practical, but ISC is likely to archive its keynotes; for biologists, the talk is well worth watching.

Advanced imaging, such as light-sheet microscopy, now makes it possible to observe life-science phenomena in 3D and in great detail at the cellular and intracellular level.

“We can image an embryo from the time it is fertilized to the time it moves out of the microscope field by itself and continues its life. When we image the fruit fly embryo over the 72 hours of development, we gather 180 TB of image data. If you would like to visualize that in real time, that means a rendering throughput of about 1.8 gigapixels per second,” said Sbalzarini[i]. A key advantage here is that the animal stays alive, unlike with older approaches requiring stains and fixing.

Hardly just pretty pictures, the extensive image data captured (and the visualizations it makes possible) are the raw input for building hypotheses and predictive models. The other primary driver is Sbalzarini’s clever adaptation of particle-mesh techniques to convert the data into actionable in silico simulations. Underlying HPC infrastructure, of course, is the engine without which the whole process would grind to a halt.

“The numerical methods are particle methods or hybrid particle-mesh methods. They comprise an interesting class of numerical methods. They discretize the system by particles, so if you have a complex geometry, you don’t need to generate a mesh for the simulation; you simply fill the geometry with particles that store the variables. There can be a mesh in addition, in order to compute, for example, forces from far-field equations,” he said.

“This is the classic framework of particle-mesh methods to solve partial differential equations, but particle methods as an algorithm are much more general than that. I would define as a particle method everything that is composed of dots – zero-dimensional elements – that are characterized by a position in some space and some properties that they carry. Such an algorithm can be used to solve partial differential equations, where the particles are the collocation points of your discretization and they store the values of the field at that position.”
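
To make that definition concrete, here is a minimal, self-contained C++ sketch – ours for illustration, not code from the talk – of the two ingredients just described: particles that carry a position and a stored property, plus the classic particle-to-mesh interpolation step that hybrid methods use to move particle quantities onto a mesh.

    #include <array>
    #include <cstdio>
    #include <vector>

    // A "dot of zero dimension": a position in space plus the properties it carries.
    struct Particle {
        std::array<double, 2> x; // position in the unit square
        double strength;         // the field value / source strength it stores
    };

    // Classic particle-to-mesh interpolation (bilinear "cloud-in-cell" weights):
    // scatter each particle's strength onto the four surrounding mesh nodes.
    std::vector<double> particles_to_mesh(const std::vector<Particle>& particles,
                                          int n, double h) {
        std::vector<double> mesh(static_cast<size_t>(n) * n, 0.0);
        for (const Particle& p : particles) {
            const int i = static_cast<int>(p.x[0] / h);
            const int j = static_cast<int>(p.x[1] / h);
            if (i < 0 || j < 0 || i + 1 >= n || j + 1 >= n) continue; // skip boundary
            const double fx = p.x[0] / h - i;
            const double fy = p.x[1] / h - j;
            mesh[i * n + j]           += (1 - fx) * (1 - fy) * p.strength;
            mesh[(i + 1) * n + j]     += fx       * (1 - fy) * p.strength;
            mesh[i * n + j + 1]       += (1 - fx) * fy       * p.strength;
            mesh[(i + 1) * n + j + 1] += fx       * fy       * p.strength;
        }
        return mesh;
    }

    int main() {
        std::vector<Particle> particles = {{{0.30, 0.40}, 1.0}, {{0.70, 0.55}, 2.0}};
        std::vector<double> mesh = particles_to_mesh(particles, 16, 1.0 / 16);
        double total = 0.0;
        for (double v : mesh) total += v;
        std::printf("strength conserved on mesh: %.3f\n", total); // prints 3.000
        return 0;
    }

Because the bilinear weights for each particle sum to one, the total strength is conserved on the mesh – one reason this scatter step is a workhorse of particle-mesh solvers.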

He quickly added, “There is nothing that limits us to having particles interact in a deterministic fashion, and this then also allows us to solve stochastic differential equations numerically, or to perform agent-based simulation or agent-based modeling.”
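
As a toy illustration of that stochastic side (again our sketch, with arbitrary parameters), each particle below integrates the SDE dX = √(2D)·dW by Euler–Maruyama, so the density of the particle cloud solves the diffusion equation ∂u/∂t = D·∂²u/∂x².

    #include <cmath>
    #include <cstdio>
    #include <random>
    #include <vector>

    // Each particle independently integrates dX = sqrt(2*D) dW (Euler–Maruyama).
    // The empirical density of the particle cloud then solves du/dt = D * u''.
    int main() {
        const double D = 0.1, dt = 1e-3, T = 1.0;
        const int n_particles = 10000;
        std::mt19937 rng(42);
        std::normal_distribution<double> gauss(0.0, 1.0);

        std::vector<double> x(n_particles, 0.0); // all particles start at the origin
        for (double t = 0.0; t < T; t += dt)
            for (double& xi : x)
                xi += std::sqrt(2.0 * D * dt) * gauss(rng);

        // For a point source, the variance of the cloud should approach 2*D*T.
        double var = 0.0;
        for (double xi : x) var += xi * xi;
        std::printf("empirical variance %.4f vs. theory %.4f\n",
                    var / n_particles, 2.0 * D * T);
        return 0;
    }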

Building the computational tools to deliver these models has been a challenging and lengthy task for which Sbalzarini is well qualified. He is the chair of scientific computing for systems biology on the faculty of computer science of TU Dresden, as well as the faculty of mathematics, and director of the TUD-Department in the Center for Systems Biology Dresden. He is also a permanent senior research group leader at the Max Planck Institute of Molecular Cell Biology and Genetics in Dresden.

Leaving out many details, and with regrets for over-simplification: Sbalzarini and colleagues imaged the fruit fly embryo; used machine learning to identify ‘algorithms’; converted the data and algorithms into models based on particle-mesh approaches using their home-grown platform; ran computational experiments to test their hypotheses; and used immersive visualization technology to let researchers watch the real process and the simulations unfold. “It is possible to walk around inside the simulation,” he said. Informed by what they saw and by their domain knowledge, the researchers tweaked parameters and hypotheses, iteratively converging on a solution.

“To me it is a very nice example of how HPC and the numerically intricate simulations we can do with these machines allow us to bridge from the molecular scale to the tissue scale, in order to explain how things work and in order to propose remedies,” said Sbalzarini.

Sbalzarini reminded the audience that living systems are computing machines themselves: “[A fruit fly embryo] is a massively parallel and fully self-organized system in which we can view every single cell as a processing element that executes programs. [It’s a] highly interconnected computer, able to solve NP-hard problems with billions or hundreds of billions of processing elements. We know a lot about the hardware of this computer – the proteins, the molecules, the lipids, the fats out of which this computer is made – and thanks to sequencing technology, [we’re] able to read the source code of this computer, which is the genomic sequence. However, we have no idea what algorithms this source code implements on this hardware.”

Now, advanced imaging and machine learning capabilities are catalyzing researchers’ ability to identify ‘mechanistic’ guidelines and to incorporate traditional formulations (ODEs/PDEs) of physical laws and mathematics into the life-sciences toolbox. Chemical diffusion. Fluid dynamics. EMI influences. Activation-energy thresholds. These are the kinds of attributes that can be captured in particle-mesh models.
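
As one hedged example of how such a physical law becomes a particle model, the 1D sketch below implements chemical diffusion with the particle strength exchange (PSE) scheme used in this family of methods; the Gaussian kernel normalization and all parameters are our own illustrative choices.

    #include <cmath>
    #include <cstdio>
    #include <vector>

    // Particle Strength Exchange (PSE) for du/dt = D * d2u/dx2 in 1D.
    // Particles exchange strength symmetrically through a kernel eta_eps,
    // which approximates the Laplacian and conserves total mass exactly.
    int main() {
        const int    N   = 200;        // particles on [0, 1]
        const double h   = 1.0 / N;    // inter-particle spacing (= particle volume)
        const double eps = h;          // kernel width
        const double D = 0.01, dt = 1e-4;
        const double pi = 3.14159265358979323846;

        // Initial condition: a narrow Gaussian bump centered at x = 0.5.
        std::vector<double> x(N), u(N), dudt(N);
        for (int p = 0; p < N; ++p) {
            x[p] = (p + 0.5) * h;
            u[p] = std::exp(-std::pow((x[p] - 0.5) / 0.05, 2));
        }

        // Gaussian kernel normalized so its second moment equals 2*eps^2:
        // eta_eps(z) = 4 / (eps*sqrt(pi)) * exp(-z^2 / eps^2)
        auto eta = [&](double z) {
            return 4.0 / (eps * std::sqrt(pi)) * std::exp(-z * z / (eps * eps));
        };

        for (int step = 0; step < 1000; ++step) {
            for (int p = 0; p < N; ++p) {
                double lap = 0.0;
                for (int q = 0; q < N; ++q) {  // brute force; real codes use cell lists
                    if (std::fabs(x[q] - x[p]) > 4 * eps) continue; // kernel cutoff
                    lap += h * (u[q] - u[p]) * eta(x[p] - x[q]);
                }
                dudt[p] = D * lap / (eps * eps);
            }
            for (int p = 0; p < N; ++p) u[p] += dt * dudt[p]; // explicit Euler step
        }

        double mass = 0.0;
        for (int p = 0; p < N; ++p) mass += h * u[p];
        std::printf("mass after diffusion: %.6f\n", mass);
        return 0;
    }

Because each pairwise exchange is antisymmetric in (u_q − u_p), total mass is conserved exactly, even with the kernel truncated at the domain boundary.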

When Sbalzarini began his studies in earnest, he used an NEC SX-5 with 512 processors housed at CSCS (the Swiss National Supercomputing Centre). In 2005 that became a Cray XT3 with 1,664 processors. A lot has changed since. The first iteration of the systems biology software platform his team developed was the Parallel Particle Mesh library (PPM), written in Fortran 90 many years ago. It served as a layer between MPI and client applications for simulations of physical systems using particle-mesh methods. The PPM library runs on single- and multi-processor architectures and handles 2D and 3D problems.

“The PPM library had two parts: what we call the PPM core, which implements all the communication primitives, the load balancing, the file I/O, [and] the distributed data structures; and the PPM numerics, which implements frequently used numerical solvers, in part by using the abstractions from the core and in part by wrapping third-party libraries such as PETSc or FFTW. On top of PPM there is a domain-specific programming language called PPML (PPM Language), which provides a reasonably simple way of coding PPM, but you could also directly interface with the Fortran API,” he said.

PPML used overloading and generic interfaces, and provided implementations of the important routines for different hardware platforms – vector processors (like the NEC system), shared memory, distributed memory, even single-processor systems – said Sbalzarini.

It was a beast to maintain. “Because of the overloading, the amount of source code in the PPM library was huge – several million lines of code that needed to be maintained and ported. What we liked about PPML was the abstraction on which it is based. It’s a set of abstract data types and abstract operators for computing that are, in our opinion, the most coarse-grained abstractions possible that still cleanly separate computation from communication. So in PPM an abstraction would either only compute but not incur any communication overhead, or it would only communicate but not do any computation,” he said.
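
What such a separation can look like in code is sketched below – a conceptual C++ rendering of the idea, not PPM’s actual Fortran interface: every operation on a distributed particle set either computes locally or communicates, never both.

    #include <vector>

    // Conceptual sketch of the PPM-style abstraction: an operation either
    // computes on local data (no messages) or communicates (no user math).
    template <typename P>
    struct DistributedParticleSet {
        std::vector<P> local; // particles owned by this process
        std::vector<P> ghost; // read-only copies of neighbors on other processes

        // Compute-only: apply a kernel to every local particle. Because it
        // touches no remote data, it incurs zero communication overhead.
        template <typename F>
        void for_each_local(F&& f) {
            for (P& p : local) f(p);
        }

        // Communicate-only: refresh the ghost layer from neighboring processes.
        // No user computation happens here; a real implementation would issue
        // the MPI halo exchange behind this call.
        void ghost_update() { /* MPI sends/receives of boundary particles */ }
    };

An interaction step then becomes a ghost_update() followed by a for_each_local(...), which makes the communication cost of a simulation explicit and countable.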

Five years ago the platform was overhauled. “We decided to keep the abstractions – to keep the definitions of the data types and the operators – but now implement a C++ library, which is called OpenFPM (Open Framework for Particle Methods), and make use of template metaprogramming in C++ for compile-time code generation. OpenFPM can do much more than PPM; for example, it can do simulations in arbitrary-dimensional spaces, where PPM is limited to 2D and 3D. OpenFPM allows particle properties to be objects of any C++ class the user can define, and all the communication and file I/O will work for it,” he said.
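
The sketch below illustrates the underlying idea in isolation – it is illustrative only, not OpenFPM’s real API: making the dimension and the particle payload template parameters lets the compiler generate each specialization, so the handwritten library stays small.

    #include <array>
    #include <cstddef>
    #include <vector>

    // Dimension and per-particle payload are compile-time parameters: the
    // compiler stamps out specialized code for each combination, replacing
    // the millions of lines of overloaded Fortran in the old PPM library.
    template <unsigned DIM, typename Props>
    class ParticleSet {
    public:
        void add(const std::array<double, DIM>& position, const Props& properties) {
            pos_.push_back(position);
            props_.push_back(properties);
        }
        std::size_t size() const { return pos_.size(); }
    private:
        std::vector<std::array<double, DIM>> pos_;
        std::vector<Props> props_;
    };

    // Any user-defined C++ type can ride along as a particle property;
    // generic communication and file I/O can be derived from the type itself.
    struct CellState {
        double concentration;
        std::array<double, 3> traction;
    };

    int main() {
        ParticleSet<3, CellState> cells;   // a 3D simulation
        ParticleSet<7, double>    highdim; // arbitrary dimension, as in OpenFPM
        cells.add({0.1, 0.2, 0.3}, {1.0, {0.0, 0.0, 0.0}});
        return 0;
    }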

Adopting template metaprogramming cut the amount of code required – “about a factor of ten less complexity than the PPM.”

Sbalzarini presented many more details in his rich talk. It will be interesting to watch how widely OpenFPM is used and whether it gains traction in other domains. Ease of use is a key question for many biomedical researchers and clinicians. Sbalzarini said, “This hopefully makes HPC so easy to use that every science-based application in biology, in computational biology, and also in other fields can benefit.”

That said, computing expertise, particularly HPC expertise, has historically been lacking in the life sciences, although that is changing, and fairly quickly.

The main motivation is to understand biology – to understand how cells form tissues – and eventually to be able to provide novel explanations for disease phenotypes and maybe therapies for disease, said Sbalzarini. Nevertheless, “For us as computer scientists it’s also just a lot of fun, because what we do combines several technologies that we think are fun to work with – technologies like virtual reality, HPC, massively scalable software systems, building microscopes and playing with optics, or using and developing artificial intelligence and learning algorithms to interface with the living things in the microscope.”

[i] Some quotes have been very lightly edited to improve readability.
