Cancer Research: A Supercomputing Perspective

By Aaron Dubrow

May 31, 2017

Cancer, the second-leading cause of death in the U.S. after heart disease, kills more than 500,000 citizens per year, including about 2,000 children.

In 2016, then Vice President Joe Biden launched the Cancer Moonshot, saying: “I know that we can help solidify a genuine global commitment to end cancer as we know it today —  and inspire a new generation of scientists to pursue new discoveries and the bounds of human endeavor.”

The importance of high performance computing (HPC) in cancer research was recognized by the Cancer Moonshot Task Force report, and by then Vice President Joe Biden and Energy Secretary Ernie Moniz.

“Supercomputers are key to the Cancer Moonshot,” Moniz wrote. “These exceptionally high-powered machines have the potential to greatly accelerate the development of cancer therapies by finding patterns in massive datasets too large for human analysis. Supercomputers can help us better understand the complexity of cancer development, identify novel and effective treatments, and help elucidate patterns in vast and complex data sets that advance our understanding of cancer.”

With complex, non-linear signaling networks, multiscale dynamics from the quantum to the macro level, and giant, complex datasets of patient responses, cancer is quite possibly the ultimate in HPC problems.

“What could be more complicated and more important?” said J. Tinsley Oden, a computational researcher at The University of Texas at Austin applying uncertainty quantification to cancer treatment predictions. “At each step, it has the most complex features. It is really a garden of rich, important problems that are in the path of many of the developments that we’ve been working on for years.”

Infographic depicts TACC’s multi-domain approach to fighting cancer.

Hundreds of oncologists, biologists and computer scientists use the HPC systems at the Texas Advanced Computing Center (TACC) to understand the fundamental nature of cancer biology and to improve cancer treatments. Their work addresses a range of cancer types and treatment modalities, and spans both applied and fundamental research.

Though diverse in their specific targets, the approaches they use can be loosely grouped into seven broad methodologies: molecular simulation; bioinformatics; mathematical modeling; computational treatment planning; quantum calculation; clinical trial design; and machine learning. The following sections describe and provide examples of each.

Molecular Simulations

Simulating protein and drug interactions at the molecular level enables scientists to understand the mechanics of cancer and design more effective treatments.

For Rommie Amaro, professor of Chemistry and Biochemistry at the University of California, San Diego, this means uncovering new pockets in tumor protein 53 (p53) — “the guardian of the genome” — which plays a crucial role in conserving the stability of DNA and preventing mutations.

The model of full-length p53 protein bound to DNA as a tetramer. The surface of each p53 monomer is depicted with a different color. [Courtesy: Özlem Demir, University of California, San Diego]

In approximately 50 percent of all human cancers, p53 is mutated and rendered inactive. Reactivating mutant p53 using small molecules has therefore been a long-sought-after anticancer therapeutic strategy.

In September 2016, writing in the journal Oncogene, Amaro reported results of the largest atomic-level simulation of p53 to date, comprising more than 1.5 million atoms. The simulations, enabled by the Stampede supercomputer at TACC, helped identify new binding sites on the surface of the protein that could potentially reactivate p53.

“When most people think about cancer research they probably don’t think about computers,” she said. “But biophysical models are getting to the point where they have a great impact on the science.”

Virtual drug screening is another important HPC application for cancer research. Shuxing Zhang, professor of experimental therapeutics at MD Anderson Cancer Center, used molecular simulations on TACC’s Lonestar5 system to screen 1,448 Food and Drug Administration-approved small molecule drugs to determine which had the molecular features needed to bind and inhibit TNIK, an enzyme that plays a key role in cell signaling in colon cancer.

Zhang discovered that mebendazole, an FDA-approved drug that fights parasites, could effectively bind to TNIK and inhibit its enzymatic activity. He reported his results in Scientific Reports in September 2016.

“Such advantages render the possibility of quickly translating the discovery into a clinical setting for cancer treatment in the near future,” Zhang wrote.

Bioinformatics

The human genome consists of three billion base pairs, so identifying single mutations by sight simply isn’t possible. For that reason, the field of bioinformatics — which uses computing and software to identify patterns and differences in biological data — has been an enormous boon for cancer researchers.
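As a toy illustration of the simplest kind of pattern matching, the sketch below compares a sample sequence against a reference of the same length and reports single-base substitutions. The sequences and the function name `find_point_mutations` are hypothetical and chosen only for illustration. Real pipelines must first align millions of short sequencing reads to a reference genome before any such comparison is possible.

```python
def find_point_mutations(reference, sample):
    """Return (1-based position, reference base, sample base) for each mismatch.

    Assumes the two sequences are already aligned and equal in length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


# Hypothetical 12-base fragments, not real genomic data.
reference = "ATGGAGGAGCCG"
sample = "ATGGAGCAGCCG"

mutations = find_point_mutations(reference, sample)
print(mutations)  # one substitution: G -> C at position 7
```

A linear scan like this is trivial for twelve bases. The difficulty in practice comes from scale, sequencing errors, and insertions or deletions that break the position-by-position correspondence.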

But bioinformatics is more than simple, one-to-one pattern matching.

A heat map showing differences in gene expression between primary tumors and cultured cell lines. Each row is a gene and each column is a tumor or cell sample. Red indicates high expression and blue indicates low expression. NHA refers to normal human astrocytes, star-shaped glial cells of the central nervous system. [Courtesy: Amelia Weber Hall, Iyer lab]

“When you move into multi-dimensional, time-series, or population-level studies, the algorithms can get a lot more computationally intensive,” said Matt Vaughn, TACC’s Director of Life Sciences Computing. “This requires resources like those at TACC, which help large numbers of researchers explore the complexity of cancer genomes by providing elastic, large-scale computing capability.”

For Vishy Iyer, a molecular biologist at The University of Texas at Austin (UT Austin), and his collaborators, access to TACC’s Stampede supercomputer helps them mine reams of data from The Cancer Genome Atlas to identify genetic variants and subtle correlations that affect gene expression in tumors.

“TACC has been vital to our analysis of cancer genomics data, both for providing the necessary computational power and the security needed for handling sensitive patient genomic datasets,” Iyer said.

In February 2016, Iyer and a team of researchers from UT Austin and MD Anderson Cancer Center reported in Nature Communications on a genome-wide transcriptome analysis of the two types of cells that make up the prostate gland. They identified cell-type-specific gene signatures that were associated with aggressive subtypes of prostate cancer and adverse clinical responses.

“This knowledge can be helpful in the development of more targeted therapies that seek to eliminate cancer at its origin,” Iyer said.

Using a similar methodology, Iyer and a team of researchers from UT Austin and the National Cancer Institute identified a transcription factor associated with an aggressive type of lymphoma that is highly correlated with poor therapeutic outcomes. They published their results in the Proceedings of the National Academy of Sciences in January 2016.

Whereas Iyer, an experienced HPC user, develops custom tools for his analyses, a much larger number of researchers access Stampede and comparable systems through scientific gateways. One prominent gateway is Galaxy, an open source bioinformatics platform that serves 30,000 researchers and runs more than 3,000 compute jobs a day.

Since 2014, TACC has powered the data analyses for a large percentage of Galaxy users, allowing researchers to solve tough problems in cases where their personal computer or campus cluster is not sufficient. Of those researchers, a significant subset use the site to analyze cancer genomes.

“Galaxy can be used to identify tumor mutations that drive cancer growth, find proteins that are overexpressed in a tumor, as well as for chemo-informatics and drug discovery,” said Jeremy Goecks, Assistant Professor of Biomedical Engineering and Computational Biology at Oregon Health and Science University and one of Galaxy’s principal investigators.

Goecks estimates that hundreds of researchers each year use the platform for cancer research, himself included. Because cancer patient data is closely protected, the bulk of this usage involves either publicly available cancer data or data on cancer cell lines, immortalized cells that reproduce in the lab and are used to study how cancer reacts to different drugs or conditions.

“This is an ideal marriage of TACC having tremendous computing power with scalable architecture and Galaxy coming along and saying, we’re going to go the last mile and make sure that people who can’t normally use this hardware are able to.”

Mathematical Modeling

While some researchers believe bioinformatics will rapidly advance the understanding and treatment of cancer, others think a better approach is to mathematize cancer: to uncover the fundamental formulas that represent how cancer, in its varied forms, behaves.

At the Center for Computational Oncology at UT Austin, researchers are developing complex computer models to predict how cancer will progress in a specific individual.

Each factor involved in the tumor response, whether it is the speed with which chemotherapeutic drugs reach the tissue or the degree to which cells signal each other to grow, is characterized by a mathematical equation that captures its essence. These models are then combined, parameterized, and initialized with patient-specific data.
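To make the idea concrete, here is a minimal sketch that assumes a single logistic-growth equation with a drug-kill term, dN/dt = rN(1 - N/K) - dN, in place of the center's far richer models. The function name, parameter names and values are hypothetical stand-ins for quantities that would be estimated from an individual patient's data.

```python
def simulate_tumor(n0, growth_rate, carrying_capacity, kill_rate, days, dt=0.01):
    """Integrate dN/dt = r*N*(1 - N/K) - d*N with forward Euler steps."""
    n = n0
    steps = int(round(days / dt))
    for _ in range(steps):
        dn_dt = growth_rate * n * (1.0 - n / carrying_capacity) - kill_rate * n
        n += dn_dt * dt
    return n


# Hypothetical "patient-specific" parameters (illustrative values only).
patient = {"n0": 1e6, "growth_rate": 0.10, "carrying_capacity": 1e9}

untreated = simulate_tumor(**patient, kill_rate=0.0, days=60)
treated = simulate_tumor(**patient, kill_rate=0.15, days=60)
print(f"untreated: {untreated:.3e} cells, treated: {treated:.3e} cells")
```

Swapping in a different parameter dictionary is the toy equivalent of initializing the same equations for a different patient.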

In April 2017, writing in the Journal of The Royal Society Interface, Thomas Yankeelov and collaborators at UT Austin and Vanderbilt University showed that they can predict how brain tumors (gliomas) will grow in mice with greater accuracy than previous models by including factors like the mechanical forces acting on the cells and the tumor’s cellular heterogeneity.

To develop and implement their mathematically complex models, the center’s scientists use TACC’s supercomputers, which enable them to solve bigger problems than they otherwise could and reach solutions far faster.

Recently, the group has begun a clinical study to predict, after one treatment, how an individual’s cancer will progress, and use those predictions to plan the future course of treatment.

“There are not enough resources or patients to sort this problem out because there are too many variables. It would take until the end of time,” Yankeelov said. “But if you have a model that can recapitulate how tumors grow and respond to therapy, then it becomes a classic engineering optimization problem. ‘I have this much drug and this much time. What’s the best way to give it to minimize the number of tumor cells for the longest amount of time?’”
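The optimization framing in that quote can be sketched with a brute-force search. The example assumes a one-equation toy model (daily logistic growth plus a saturating dose response) and asks how a fixed drug budget should be split into evenly spaced doses to minimize the average tumor burden. The model, the function `mean_burden` and every parameter value are hypothetical; this is not the group's method.

```python
def mean_burden(n_doses, total_drug=12.0, horizon_days=84,
                n0=1e8, growth_rate=0.05, capacity=1e10,
                emax=0.9, ec50=4.0):
    """Average tumor cell count over the horizon for evenly spaced, equal doses."""
    dose = total_drug / n_doses
    interval = horizon_days // n_doses               # days between doses
    dose_days = {k * interval for k in range(n_doses)}
    kill_fraction = emax * dose / (dose + ec50)      # saturating dose response
    n, total = n0, 0.0
    for day in range(horizon_days):
        if day in dose_days:
            n *= 1.0 - kill_fraction                 # cells killed by today's dose
        n += growth_rate * n * (1.0 - n / capacity)  # one day of logistic growth
        total += n
    return total / horizon_days


# Number of doses the fixed drug budget is split into (all divide 84 evenly).
candidates = [1, 2, 3, 4, 6, 12]
scores = {k: mean_burden(k) for k in candidates}
best = min(scores, key=scores.get)

for k in candidates:
    print(f"{k:2d} doses -> mean burden {scores[k]:.3e}")
print("best schedule:", best, "doses")
```

With six candidate schedules an exhaustive search is instant. A realistic model with spatial structure, several drugs and continuous timing has a search space large enough to need the kind of computing described here.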

Computing at TACC helps Yankeelov accelerate his research. “We can solve problems in a few minutes that would take us three weeks to do using the resources at our old institution,” he said. “It’s phenomenal.”

Quantum Calculations

X-ray radiation is the most frequently used form of radiation therapy, but a new treatment is emerging that uses a beam of protons to kill cancer cells with minimal damage to surrounding tissues.

“As happens in cancer therapy, we know empirically that it works, but we don’t know why,” said Jorge A. Morales, a professor of chemistry at Texas Tech University and a leading proponent of the computational analysis of proton therapy. “To do experiments with human subjects is dangerous, so the best way is through computer simulation.”

Computational experiments can mimic the dynamics of the proton-cell interactions without causing damage to a patient and can reveal what happens when the proton beam and cells collide from start to finish, with atomic-level accuracy. Morales has been simulating proton-cell chemical reactions using quantum dynamics models on TACC’s Stampede supercomputer to investigate the fundamentals of the process.

His studies, reported in PLOS One in March 2017, as well as in Molecular Physics and Chemical Physics Letters (2015 and 2014, respectively), have determined the basic byproducts of protons colliding with water within the cell, and with nucleotides and clusters of DNA bases, the basic units of DNA. The studies shed light on how the protons and their water radiolysis products damage DNA.

Though fundamental in nature, the insights and data that Morales’ simulations produce help researchers understand proton cancer therapy at the quantum level, and help modulate factors like dosage and beam direction.

“These simulations will bring about a unique way to understand and control proton cancer therapy that, at a very low cost, will help to drastically improve the treatment of cancer patients without risking human subjects,” Morales said.

Computational Treatment Planning

Wei Liu, a researcher at the Mayo Clinic, also studies proton therapy, but he looks at the treatment from a clinical perspective.

In comparison with current radiation procedures, proton therapy saves healthy tissue in front of and behind the tumor. It is particularly effective when irradiating tumors near sensitive organs where stray beams can be particularly damaging.

However, the pinpoint accuracy required by the proton beam, which is its greatest advantage, means that it must be precisely calibrated and that discrepancies from the ideal (whether from device, human error or even patient breathing) must be taken into consideration.

Writing in Medical Physics in January 2017, Liu and his collaborators showed that their “chance-constrained model” was better at sparing organs at risk than current methods.

“Each time, we try to mathematically generate a good plan,” he said. “There are 25,000 variables or more, so generating a plan that is robust to these mistakes and can still get the proper dose distribution to the tumor is a large-scale optimization problem.”

The researchers used the Lonestar5 supercomputer at TACC to generate treatment plans that minimize the risk and uncertainties involved in proton beam therapy.

“It’s very computationally expensive to generate a plan in a reasonable timeframe,” he continued. “Without a supercomputer, we can do nothing.”
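The article does not spell out Liu's chance-constrained model. As a rough sketch of the general idea of a chance constraint, the example below assumes a single beam weight and 1,000 randomly sampled delivery scenarios in place of the 25,000 or more variables of a real plan. It finds the smallest weight that delivers the prescribed tumor dose in at least 95 percent of scenarios and compares it with a plan that covers every scenario. All names and numbers are hypothetical.

```python
import math
import random


def min_weight_for_coverage(gains, prescription, coverage):
    """Smallest beam weight w such that w * gain >= prescription
    holds in at least `coverage` of the sampled scenarios."""
    needed = sorted(prescription / g for g in gains)  # weight each scenario needs
    k = math.ceil(coverage * len(needed)) - 1         # last scenario that must be covered
    return needed[min(max(k, 0), len(needed) - 1)]


random.seed(0)
# Hypothetical scenarios: the tumor dose delivered per unit beam weight varies
# with setup error, range uncertainty and breathing motion.
gains = [random.gauss(1.0, 0.05) for _ in range(1000)]
prescription = 60.0  # illustrative target dose

w_chance = min_weight_for_coverage(gains, prescription, coverage=0.95)
w_worst = min_weight_for_coverage(gains, prescription, coverage=1.0)

# Fraction of scenarios whose required weight is met by the 95 percent plan.
covered = sum(prescription / g <= w_chance for g in gains) / len(gains)
print(f"95% plan weight {w_chance:.2f}, worst-case weight {w_worst:.2f}, "
      f"coverage achieved {covered:.3f}")
```

The gap between the two weights is the point of the approach: accepting a small, explicit probability of underdosing allows a lower beam weight, and so less dose to nearby organs, than insisting on the worst case.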

Computational Trial Design

Another way researchers use TACC’s advanced computers is to design clinical trials that can better determine which combination of dosages will be most effective, specifically for the biological agents used in immunotherapy, which work very differently from chemotherapy and radiation.

Writing in the Journal of the Royal Statistical Society: Series C (Applied Statistics), Chunyan Cai, assistant professor of biostatistics at McGovern Medical School at The University of Texas Health Science Center at Houston (UTHealth), described her efforts using Lonestar5 to identify biologically optimal dose combinations for agents that target the PI3K/AKT/mTOR signaling pathway, which has been associated with several genetic aberrations related to the promotion of cancer.

Scanning electron micrograph of a human T lymphocyte (also called a T cell) from the immune system of a healthy donor. Immunotherapy fights cancer by supercharging the immune system’s natural defenses (including T cells) or contributing additional immune elements that can help the body kill cancer cells. HPC is helping researchers better understand how immunotherapeutic agents can be used effectively. [Courtesy: NIAID]

“Our research is motivated by a drug combination trial at MD Anderson Cancer Center for patients diagnosed with relapsed lymphoma,” Cai said. “The trial combined two novel biological agents that target two different components in the PI3K/AKT/mTOR signaling pathway.”

They investigated six different dose-toxicity and dose-efficacy scenarios and carried out 2,000 simulated trials for each of the designs.

Based on those simulations, she concluded that “the design proposed has desirable operating characteristics in identifying the biologically optimal dose combination under various patterns of dose–toxicity and dose–efficacy relationships.”

The research is leading to new, safer and more effective ways to test combinations of immunotherapeutic agents.
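Evaluating a trial design by simulation follows a simple pattern: fix a scenario of "true" toxicity and efficacy rates, run the design on thousands of virtual trials, and count how often it picks the right dose combination. The sketch below assumes a deliberately naive fixed-cohort selection rule rather than the adaptive design proposed in the paper, and one made-up scenario on a 2 x 2 dose grid. All rates, names and thresholds are hypothetical.

```python
import random


def simulate_trial(tox, eff, rng, patients_per_combo=30, max_tox=0.30):
    """Run one simulated trial and return the selected dose combination.

    tox / eff map each dose combination to its true toxicity / efficacy
    probability. Naive rule: pick the combination with the highest observed
    response rate among those whose observed toxicity rate is <= max_tox.
    """
    best, best_rate = None, -1.0
    for combo in tox:
        tox_events = sum(rng.random() < tox[combo] for _ in range(patients_per_combo))
        responses = sum(rng.random() < eff[combo] for _ in range(patients_per_combo))
        if tox_events / patients_per_combo <= max_tox:
            rate = responses / patients_per_combo
            if rate > best_rate:
                best, best_rate = combo, rate
    return best  # None when every combination looks too toxic


# One hypothetical scenario: keys are (dose level of agent A, dose level of agent B).
true_tox = {(1, 1): 0.05, (1, 2): 0.10, (2, 1): 0.15, (2, 2): 0.45}
true_eff = {(1, 1): 0.20, (1, 2): 0.30, (2, 1): 0.60, (2, 2): 0.65}
target = (2, 1)  # most effective combination whose true toxicity is acceptable

rng = random.Random(2017)
n_trials = 2000
picks = [simulate_trial(true_tox, true_eff, rng) for _ in range(n_trials)]
correct_rate = picks.count(target) / n_trials
print(f"selected the target combination in {correct_rate:.1%} of {n_trials} simulated trials")
```

Repeating this for several scenarios and several competing designs multiplies the work quickly, and each simulated trial in a realistic Bayesian design involves model fitting rather than simple counting.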

Machine Learning

A final, and truly radical, way that researchers are using HPC for cancer research is through the application of machine and deep learning.

The Eberlin research group at UT Austin develops clinical applications of ambient mass spectrometry for cancer diagnosis. They create tools and techniques to assist surgeons in distinguishing between normal and cancer tissue during tumor resection operations.

To do so, they have had to develop statistical methods that can analyze and interpret large amounts of mass spectrometry data gathered from clinical samples.

Jonathan Young, a postdoctoral researcher in the group, is building machine learning classifiers to reliably predict whether a given tissue sample is cancerous or normal and, if it is cancerous, which specific subtype the tumor belongs to.

Young uses the Maverick system at TACC, which contains a large number of NVIDIA GPUs, to develop and implement the machine learning algorithms. “The large memory capacity of Maverick is well suited for our extensive datasets, and the parallelization capability will aid in parameter sweeps during the training of classifiers,” Young said.
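The article does not describe Young's classifiers in detail. To show what a parameter sweep looks like, the sketch below swaps in a k-nearest-neighbors classifier trained on synthetic "spectra" and tries several values of k on a held-out validation set. The data generator, the class difference and all function names are hypothetical.

```python
import random


def make_spectrum(is_cancer, rng, n_peaks=20):
    """Synthetic stand-in for a mass spectrum: a list of peak intensities."""
    shift = 1.5 if is_cancer else 0.0  # hypothetical class difference in the first five peaks
    return [rng.gauss(shift if i < 5 else 0.0, 1.0) for i in range(n_peaks)]


def knn_predict(train, x, k):
    """Majority vote among the k nearest training spectra (squared Euclidean distance)."""
    nearest = sorted(
        train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x))
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train, test, k):
    """Fraction of test spectra whose predicted label matches the true label."""
    return sum(knn_predict(train, x, k) == y for x, y in test) / len(test)


rng = random.Random(0)
data = [(make_spectrum(label, rng), label) for label in [0, 1] * 100]
rng.shuffle(data)
train, valid = data[:150], data[150:]

# Parameter sweep: each value of k is an independent job, which is what
# makes sweeps straightforward to spread across many processors.
sweep = {k: accuracy(train, valid, k) for k in (1, 3, 5, 9, 15)}
best_k = max(sweep, key=sweep.get)
print(sweep, "best k:", best_k)
```

Here the sweep covers five settings of one parameter. With several parameters the grid grows multiplicatively, which is where a parallel machine pays off.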

Young will present his work at the American Society for Mass Spectrometry (ASMS) Annual Conference this June.

Another example of the application of machine learning to cancer can be found in the work of Daniel Lobo, an assistant professor of biology and computer science at the University of Maryland, Baltimore County (UMBC). He is using machine learning to map out the cellular communication networks that underlie cancer, and to design methods to disrupt them.

In their January 2017 paper in Scientific Reports, Lobo and collaborators showed that machine learning can uncover the cellular networks that determine pigmentation in tadpoles and reverse-engineer never-before-seen coloration. Their work was facilitated by Stampede, which enabled the team to run billions of simulations to identify models of the cellular network and the means of altering it.

Lobo’s lab is applying the method to cancer research to determine what type of interventions might stop metastasis in its tracks without damaging other cells.

“Traditional approaches like chemotherapy attack the cells that grow the most, but leave cells that are signaling others to grow and that may be the most important,” Lobo says. “We’re using machine learning to find out the communication networks between these cells and hopefully to discover a treatment that can cause the tumor to collapse.”

“Getting a true understanding, given the complexity of the information, without some assistance from machine learning, is probably hopeless,” said Michael Levin, Lobo’s collaborator. “I think it’s inevitable that we use machine learning to enrich scientific and biomedical discovery.”

From patient-specific treatments to immunology to drug discovery, advanced computing is accelerating the basic and applied science underlying our understanding of cancer and the development and application of cancer treatments.

If scientists are the rocket in the cancer moonshot, HPC processing power is the jet fuel.

About the Author

Aaron Dubrow joined TACC in October 2007 as the Science and Technology Writer with the responsibility of reporting on the myriad of research and development projects undertaken by TACC.
