Cancer Research: A Supercomputing Perspective

By Aaron Dubrow

May 31, 2017

Cancer, the second-leading cause of death in the U.S. after heart disease, kills more than 500,000 citizens per year, including about 2,000 children.

In 2016, then Vice President Joe Biden launched the Cancer Moonshot, saying: “I know that we can help solidify a genuine global commitment to end cancer as we know it today — and inspire a new generation of scientists to pursue new discoveries and push the bounds of human endeavor.”

The importance of high performance computing (HPC) in cancer research was recognized in the Cancer Moonshot Task Force report, and by both Biden and Energy Secretary Ernest Moniz.

“Supercomputers are key to the Cancer Moonshot,” Moniz wrote. “These exceptionally high-powered machines have the potential to greatly accelerate the development of cancer therapies by finding patterns in massive datasets too large for human analysis. Supercomputers can help us better understand the complexity of cancer development, identify novel and effective treatments, and help elucidate patterns in vast and complex data sets that advance our understanding of cancer.”

With complex, non-linear signaling networks, multiscale dynamics from the quantum to the macro level, and giant, complex datasets of patient responses, cancer is quite possibly the ultimate in HPC problems.

“What could be more complicated and more important?” said J. Tinsley Oden, a computational researcher at The University of Texas at Austin applying uncertainty quantification to cancer treatment predictions. “At each step, it has the most complex features. It is really a garden of rich, important problems that are in the path of many of the developments that we’ve been working on for years.”

Infographic depicting TACC’s multi-domain approach to fighting cancer.

Hundreds of oncologists, biologists and computer scientists use the HPC systems at the Texas Advanced Computing Center (TACC) to understand the fundamental nature of cancer biology and to improve cancer treatments. Their work addresses a range of cancer types and treatment modalities, and spans both applied and fundamental research.

Though diverse in their specific targets, the approaches they use can be loosely grouped into seven broad methodologies: molecular simulation; bioinformatics; mathematical modeling; computational treatment planning; quantum calculation; clinical trial design; and machine learning. The following sections describe and provide examples of each.

Molecular Simulations

Simulating protein and drug interactions at the molecular level enables scientists to understand the mechanics of cancer to design more effective treatments.

For Rommie Amaro, professor of Chemistry and Biochemistry at the University of California, San Diego, this means uncovering new pockets in tumor protein 53 (p53) — “the guardian of the genome” — which plays a crucial role in conserving the stability of DNA and preventing mutations.

The model of full-length p53 protein bound to DNA as a tetramer. The surface of each p53 monomer is depicted with a different color. [Courtesy: Özlem Demir, University of California, San Diego]
In approximately 50 percent of all human cancers, p53 is mutated and rendered inactive. Reactivating mutant p53 using small molecules has therefore been a long-sought anticancer therapeutic strategy.

In September 2016, writing in the journal Oncogene, Amaro reported results of the largest atomic-level simulation of p53 to date — comprising more than 1.5 million atoms. The simulations, enabled by the Stampede supercomputer at TACC, helped identify new binding sites on the surface of the protein that could potentially reactivate p53.

“When most people think about cancer research they probably don’t think about computers,” she said. “But biophysical models are getting to the point where they have a great impact on the science.”
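At its heart, a molecular dynamics simulation like the p53 study repeatedly computes the forces on every atom and advances the atoms through tiny time steps. A minimal sketch of the idea, using a toy two-atom harmonic bond in one dimension rather than a real force field (every number here is illustrative):

```python
# Toy molecular dynamics: velocity-Verlet integration of two atoms joined
# by a harmonic bond, in one dimension. Real engines use full force fields
# over millions of atoms; all parameters here are illustrative.

def bond_force(x1, x2, k=1.0, r0=1.0):
    """Force on atom 1 from a harmonic bond; atom 2 feels the opposite."""
    return k * ((x2 - x1) - r0)

def velocity_verlet_step(x, v, m, dt=0.01):
    """Advance both atoms by one time step dt."""
    f = bond_force(x[0], x[1])
    a = [f / m[0], -f / m[1]]
    v = [v[i] + 0.5 * dt * a[i] for i in range(2)]   # half kick
    x = [x[i] + dt * v[i] for i in range(2)]         # drift
    f = bond_force(x[0], x[1])
    a = [f / m[0], -f / m[1]]
    v = [v[i] + 0.5 * dt * a[i] for i in range(2)]   # half kick
    return x, v

x, v, m = [0.0, 1.5], [0.0, 0.0], [1.0, 1.0]         # bond stretched by 0.5
for _ in range(1000):
    x, v = velocity_verlet_step(x, v, m)
bond_length = x[1] - x[0]   # oscillates around the rest length r0 = 1.0
```

Production MD codes apply this same compute-forces-then-advance loop to millions of interacting atoms in parallel, which is what makes runs like the 1.5-million-atom p53 simulation a supercomputing problem.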

Virtual drug screening is another important HPC application for cancer research. Shuxing Zhang, professor of experimental therapeutics at MD Anderson Cancer Center, used molecular simulations on TACC’s Lonestar5 system to screen 1,448 Food and Drug Administration-approved small molecule drugs to determine which had the molecular features needed to bind and inhibit TNIK — an enzyme that plays a key role in cell signaling in colon cancer.

Zhang discovered that mebendazole, an FDA-approved drug that fights parasites, could effectively bind to TNIK and inhibit its enzymatic activity. He reported his results in Nature Scientific Reports in September 2016.

“Such advantages render the possibility of quickly translating the discovery into a clinical setting for cancer treatment in the near future,” Zhang wrote.
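Conceptually, a virtual screen is a loop: score every compound in a library against the target, then rank. A sketch with a deliberately toy scoring function (a real screen would call a docking engine; the properties below are approximate or invented):

```python
# Sketch of a virtual-screening loop: score each library compound against
# a target and rank. The scoring function is a toy heuristic standing in
# for a docking engine; compound properties are approximate or invented.

def docking_score(compound):
    """Hypothetical score: prefer drug-like weight and lipophilicity
    (lower = better predicted binding)."""
    return abs(compound["mw"] - 350) / 100 + abs(compound["logp"] - 2.5)

library = [
    {"name": "mebendazole", "mw": 295.3, "logp": 2.8},
    {"name": "compound_b",  "mw": 550.0, "logp": 5.1},
    {"name": "compound_c",  "mw": 360.0, "logp": 2.4},
]

ranked = sorted(library, key=docking_score)
top_hits = [c["name"] for c in ranked[:2]]   # carry these to binding assays
```

The real screen evaluated 1,448 compounds with far costlier physics-based scoring, which is why it ran on Lonestar5 rather than a desktop.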

Bioinformatics

The human genome consists of three billion base pairs, so identifying single mutations by sight simply isn’t possible. For that reason, the field of bioinformatics — which uses computing and software to identify patterns and differences in biological data — has been an enormous boon for cancer researchers.
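The simplest version of that pattern-finding task can be sketched in a few lines: compare a sample sequence against a reference and report every single-base difference (real pipelines must first align billions of error-prone short reads; these ten-base strings are illustrative):

```python
# Minimal variant detection: report each single-base difference between a
# reference and a (pre-aligned) sample sequence. Real pipelines first
# align billions of short reads; these ten-base strings are illustrative.

def find_snvs(reference, sample):
    """Return (position, ref_base, sample_base) for every mismatch."""
    return [(i, r, s)
            for i, (r, s) in enumerate(zip(reference, sample))
            if r != s]

ref    = "ACGTACGTAC"
sample = "ACGTTCGTAA"
variants = find_snvs(ref, sample)   # → [(4, 'A', 'T'), (9, 'C', 'A')]
```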

But bioinformatics is more than simple, one-to-one pattern matching.

A heat map showing differences in gene expression between primary tumors and cultured cell lines. Each row is a gene and each column is a tumor or cell sample. In the heat map, red indicates high expression and blue indicates low expression. NHA refers to normal human astrocytes, a star-shaped glial cell of the central nervous system. [Courtesy: Amelia Weber Hall, Iyer lab]
“When you move into multi-dimensional, time-series, or population-level studies, the algorithms can get a lot more computationally intensive,” said Matt Vaughn, TACC’s Director of Life Sciences Computing. “This requires resources like those at TACC, which help large numbers of researchers explore the complexity of cancer genomes by providing elastic, large-scale computing capability.”

For Vishy Iyer, a molecular biologist at The University of Texas at Austin (UT Austin), and his collaborators, access to TACC’s Stampede supercomputer helps them mine reams of data from The Cancer Genome Atlas to identify genetic variants and subtle correlations that affect gene expression in tumors.

“TACC has been vital to our analysis of cancer genomics data, both for providing the necessary computational power and the security needed for handling sensitive patient genomic datasets,” Iyer said.

In February 2016, Iyer and a team of researchers from UT Austin and MD Anderson Cancer Center reported in Nature Communications on a genome-wide transcriptome analysis of the two types of cells that make up the prostate gland. They identified cell-type-specific gene signatures that were associated with aggressive subtypes of prostate cancer and adverse clinical responses.

“This knowledge can be helpful in the development of more targeted therapies that seek to eliminate cancer at its origin,” Iyer said.

Using a similar methodology, Iyer and a team of researchers from UT Austin and the National Cancer Institute identified a transcription factor associated with an aggressive type of lymphoma that is highly correlated with poor therapeutic outcomes. They published their results in the Proceedings of the National Academy of Sciences in January 2016.

Whereas Iyer, an experienced HPC user, develops custom tools for his analyses, a much larger number of researchers access Stampede and comparable systems through scientific gateways. One prominent gateway is Galaxy, an open source bioinformatics platform that serves 30,000 researchers and runs more than 3,000 compute jobs a day.

Since 2014, TACC has powered the data analyses for a large percentage of Galaxy users, allowing researchers to solve tough problems in cases where their personal computer or campus cluster is not sufficient. Of those researchers, a significant subset use the site to analyze cancer genomes.

“Galaxy can be used to identify tumor mutations that drive cancer growth, find proteins that are overexpressed in a tumor, as well as for chemo-informatics and drug discovery,” said Jeremy Goecks, Assistant Professor of Biomedical Engineering and Computational Biology at Oregon Health and Science University and one of Galaxy’s principal investigators.

Goecks estimates that hundreds of researchers each year use the platform for cancer research, himself included. Because cancer patient data is closely protected, the bulk of this usage involves either publicly available cancer data or data on cancer cell lines – immortalized cells that reproduce in the lab and are used to study how cancer reacts to different drugs or conditions.

“This is an ideal marriage of TACC having tremendous computing power with scalable architecture, and Galaxy coming along and saying, we’re going to go the last mile and make sure that people who can’t normally use this hardware are able to,” Goecks said.

Mathematical Modeling

While some researchers believe bioinformatics will rapidly advance the understanding and treatment of cancer, others think a better approach is to mathematize cancer: to uncover the fundamental formulas that represent how cancer, in its varied forms, behaves.

At the Center for Computational Oncology at UT Austin, researchers are developing complex computer models to predict how cancer will progress in a specific individual.

Each factor involved in the tumor response — whether it is the speed with which chemotherapeutic drugs reach the tissue or the degree to which cells signal each other to grow — is characterized by a mathematical equation that captures its essence. These models are then combined, parameterized and initialized with patient-specific data.
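The flavor of such models can be seen in the simplest possible case: logistic growth, a single equation in which a tumor’s proliferation slows as the cell count approaches a carrying capacity. A sketch with made-up parameters (the center’s actual models couple many equations, including tissue mechanics and drug delivery):

```python
# Logistic tumor growth, dN/dt = r * N * (1 - N / K), integrated with
# forward Euler. In a real model, r (growth rate) and K (carrying
# capacity) would be calibrated from patient-specific imaging data;
# the values below are illustrative.

def simulate_tumor(n0, r, K, dt, steps):
    n = n0
    history = [n]
    for _ in range(steps):
        n += dt * r * n * (1 - n / K)
        history.append(n)
    return history

traj = simulate_tumor(n0=1e6, r=0.1, K=1e9, dt=0.1, steps=2000)
final = traj[-1]   # growth slows as the count approaches K
```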

In April 2017, writing in the Journal of The Royal Society Interface, Thomas Yankeelov and collaborators at UT Austin and Vanderbilt University showed that they can predict how brain tumors (gliomas) will grow in mice with greater accuracy than previous models by including factors such as the mechanical forces acting on the cells and the tumor’s cellular heterogeneity.

To develop and implement their mathematically complex models, the center’s scientists use TACC’s supercomputers, which enable them to solve bigger problems than they otherwise could and to reach solutions far faster.

Recently, the group has begun a clinical study to predict, after one treatment, how an individual’s cancer will progress, and use those predictions to plan the future course of treatment.

“There are not enough resources or patients to sort this problem out because there are too many variables. It would take until the end of time,” Yankeelov said. “But if you have a model that can recapitulate how tumors grow and respond to therapy, then it becomes a classic engineering optimization problem. ‘I have this much drug and this much time. What’s the best way to give it to minimize the number of tumor cells for the longest amount of time?’”
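The optimization Yankeelov describes can be sketched with a toy growth-and-kill model: fix a total drug budget, enumerate dosing schedules, and keep the schedule that leaves the fewest tumor cells (the model and all numbers are stand-ins for a calibrated patient-specific model):

```python
# Sketch of treatment-schedule optimization: given a fixed drug budget
# over a fixed number of treatment periods, find the dosing schedule that
# leaves the fewest tumor cells. Toy model; all numbers are illustrative.
from itertools import product

def tumor_cells_after(schedule, n0=1e8, growth=1.1):
    """Apply one grow-then-treat cycle per period; the kill term
    saturates, so cramming all doses into one period wastes drug."""
    n = n0
    for dose in schedule:
        n *= growth        # untreated growth during the period
        n /= (1 + dose)    # saturating kill from this period's dose
    return n

BUDGET, PERIODS = 4, 4
schedules = [s for s in product(range(BUDGET + 1), repeat=PERIODS)
             if sum(s) == BUDGET]
best = min(schedules, key=tumor_cells_after)
# In this model, spreading the budget evenly beats front-loading it.
```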

Computing at TACC helps Yankeelov accelerate his research. “We can solve problems in a few minutes that would take us three weeks to do using the resources at our old institution,” he said. “It’s phenomenal.”

Quantum Calculations

X-ray radiation is the most frequently used form of radiation therapy, but a new treatment is emerging that uses a beam of protons to kill cancer cells with minimal damage to surrounding tissue.

“As often happens in cancer therapy, we know empirically that it works, but we don’t know why,” said Jorge A. Morales, a professor of chemistry at Texas Tech University and a leading proponent of the computational analysis of proton therapy. “To do experiments with human subjects is dangerous, so the best way is through computer simulation.”

Computational experiments can mimic the dynamics of the proton-cell interactions without causing damage to a patient and can reveal what happens when the proton beam and cells collide from start to finish, with atomic-level accuracy. Morales has been simulating proton-cell chemical reactions using quantum dynamics models on TACC’s Stampede supercomputer to investigate the fundamentals of the process.

His studies, reported in PLOS One in March 2017, as well as in Molecular Physics, and Chemical Physics Letters (2015 and 2014 respectively), have determined the basic byproducts of protons colliding with water within the cell, and with nucleotides and clusters of DNA bases – the basic units of DNA. The studies shed light on how the protons and their water radiolysis products damage DNA.

Though fundamental in nature, the insights and data that Morales’ simulations produce help researchers understand proton cancer therapy at the quantum level, and help modulate factors like dosage and beam direction.

“These simulations will bring about a unique way to understand and control proton cancer therapy that, at a very low cost, will help to drastically improve the treatment of cancer patients without risking human subjects,” Morales said.

Computational Treatment Planning

Wei Liu, a researcher at the Mayo Clinic, also studies proton therapy, but he looks at the treatment from a clinical perspective.

In comparison with current radiation procedures, proton therapy saves healthy tissue in front of and behind the tumor. It is particularly effective when irradiating tumors near sensitive organs where stray beams can be particularly damaging.
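The physics behind that tissue-sparing advantage can be sketched schematically: an X-ray beam deposits dose all along its path and keeps going, while a proton deposits most of its energy near the end of its range, the Bragg peak, and then stops (the curves below are cartoons, not dosimetry):

```python
# Schematic depth-dose comparison. X-rays attenuate exponentially,
# depositing dose from the skin onward and beyond the tumor; a proton
# deposits most energy near the end of its range (the Bragg peak) and
# then stops. All shapes and numbers are illustrative.
import math

depths = [i * 0.5 for i in range(41)]   # 0 to 20 cm in 0.5 cm steps

def xray_dose(d, mu=0.1):
    return math.exp(-mu * d)            # maximal at the surface

def proton_dose(d, bragg_range=15.0, width=0.8):
    if d > bragg_range:
        return 0.0                      # no exit dose: the proton stops
    plateau = 0.3                       # low entrance dose
    peak = math.exp(-((d - bragg_range) ** 2) / (2 * width ** 2))
    return plateau + peak

proton_peak_depth = max(depths, key=proton_dose)   # at the tumor depth
xray_peak_depth = max(depths, key=xray_dose)       # at the skin
```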

However, the pinpoint accuracy required by the proton beam, which is its greatest advantage, means that it must be precisely calibrated and that deviations from the ideal (whether from the device, human error or even patient breathing) must be taken into account.

Writing in Medical Physics in January 2017, Liu and his collaborators showed that their “chance-constrained model” was better at sparing organs at risk than current methods.

“Each time, we try to mathematically generate a good plan,” he said. “There are 25,000 variables or more, so generating a plan that is robust to these mistakes and can still get the proper dose distribution to the tumor is a large-scale optimization problem.”

The researchers used the Lonestar5 supercomputer at TACC to generate treatment plans that minimize the risk and uncertainties involved in proton beam therapy.

“It’s very computationally expensive to generate a plan in a reasonable timeframe,” he continued. “Without a supercomputer, we can do nothing.”
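The robust-planning idea can be sketched on a toy problem: evaluate each candidate plan under several error scenarios and prefer the plan whose worst case is best (the beam weights and scenario values are made up; the real problem has tens of thousands of variables):

```python
# Sketch of robust treatment planning: score candidate plans under
# several error scenarios (setup shift, range error) and keep the plan
# with the best *worst case*. All weights and scenarios are made up.

def plan_quality(beam_weights, scenario):
    """Hypothetical dose to tumor: each scenario scales how much of
    each beam actually reaches the target."""
    return sum(w * eff for w, eff in zip(beam_weights, scenario))

scenarios = [          # per-beam efficiency under different errors
    (1.0, 1.0, 1.0),   # nominal delivery
    (0.7, 1.0, 1.0),   # patient shift degrades beam 1
    (1.0, 1.0, 0.6),   # range error degrades beam 3
]

candidate_plans = [
    (3.0, 0.0, 0.0),   # all weight on one beam: fragile
    (1.0, 1.0, 1.0),   # weight spread across beams: robust
]

def worst_case(plan):
    return min(plan_quality(plan, s) for s in scenarios)

robust_plan = max(candidate_plans, key=worst_case)
```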

Computational Trial Design

Another way researchers use TACC’s advanced computers is to design clinical trials that can better determine which combination of dosages will be most effective, specifically for the biological agents used in immunotherapy, which work very differently from chemotherapy and radiation.

Writing in the Journal of the Royal Statistical Society: Series C (Applied Statistics), Chunyan Cai, assistant professor of biostatistics at McGovern Medical School at The University of Texas Health Science Center at Houston (UTHealth), described her efforts using Lonestar5 to identify biologically optimal dose combinations for agents that target the PI3K/AKT/mTOR signaling pathway, which has been associated with several cancer-promoting genetic aberrations.

Scanning electron micrograph of a human T lymphocyte (also called a T cell) from the immune system of a healthy donor. Immunotherapy fights cancer by supercharging the immune system’s natural defenses (including T cells) or contributing additional immune elements that can help the body kill cancer cells. HPC is helping researchers better understand how immunotherapeutic agents can be used effectively. [Courtesy: NIAID]
“Our research is motivated by a drug combination trial at MD Anderson Cancer Center for patients diagnosed with relapsed lymphoma,” Cai said. “The trial combined two novel biological agents that target two different components in the PI3K/AKT/mTOR signaling pathway.”

Cai and her collaborators investigated six different dose-toxicity and dose-efficacy scenarios and carried out 2,000 simulated trials for each of the designs.

Based on those simulations, she concluded that “the design proposed has desirable operating characteristics in identifying the biologically optimal dose combination under various patterns of dose–toxicity and dose–efficacy relationships.”
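The underlying simulation loop can be sketched simply: assume a true dose-toxicity curve, run thousands of virtual trials at each dose, and count how often each dose is flagged as exceeding a toxicity limit (all rates below are illustrative, not taken from the published design):

```python
# Sketch of trial-design simulation: under an assumed dose-toxicity
# curve, estimate how often a trial of 30 patients would (correctly)
# flag each dose as too toxic. All rates here are illustrative.
import random

random.seed(42)

TRUE_TOXICITY = {1: 0.05, 2: 0.15, 3: 0.40}   # assumed per-dose rates
TOXICITY_LIMIT = 0.30
N_TRIALS, PATIENTS = 2000, 30

def trial_flags_dose(dose):
    events = sum(random.random() < TRUE_TOXICITY[dose]
                 for _ in range(PATIENTS))
    return events / PATIENTS > TOXICITY_LIMIT

flag_rate = {dose: sum(trial_flags_dose(dose) for _ in range(N_TRIALS)) / N_TRIALS
             for dose in TRUE_TOXICITY}
# Dose 3 (true rate 0.40) is flagged far more often than dose 1 (0.05).
```

Evaluating many designs against many dose-toxicity and dose-efficacy scenarios multiplies this loop thousands of times over, which is where Lonestar5 came in.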

The research is leading to new, safer and more effective ways to test combinations of immunotherapeutic agents.

Machine Learning

A final, and truly radical, way that researchers are using HPC for cancer research is through the application of machine and deep learning.

The Eberlin research group at UT Austin develops clinical applications of ambient mass spectrometry for cancer diagnosis. They create tools and techniques to assist surgeons in distinguishing between normal and cancer tissue during tumor resection operations.

To do so, they have had to develop statistical methods that can analyze and interpret large amounts of mass spectrometry data gathered from clinical samples.

Jonathan Young, a postdoctoral researcher in the group, is building machine learning classifiers to reliably predict whether a given tissue sample is cancerous or normal, and if it is indeed cancer, which specific subtype the tumor belongs to.

Young uses the Maverick system at TACC, which contains a large number of NVIDIA GPUs, to develop and implement the machine learning algorithms. “The large memory capacity of Maverick is well suited for our extensive datasets, and the parallelization capability will aid in parameter sweeps during the training of classifiers,” Young said.
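A stripped-down version of the classification task might look like the following: label a tissue spectrum as normal or tumor by its distance to class centroids (the group’s actual models and features are far richer; the four-peak spectra below are invented):

```python
# Stripped-down tissue classification: assign a spectrum to "normal" or
# "tumor" by nearest class centroid. Real classifiers use far richer
# features and cross-validation; these four-peak spectra are invented.

def centroid(spectra):
    """Per-feature mean of a list of spectra."""
    return [sum(col) / len(spectra) for col in zip(*spectra)]

def classify(spectrum, centroids):
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(spectrum, centroids[label]))

training = {
    "normal": [[1.0, 0.2, 0.1, 0.9], [0.9, 0.3, 0.2, 1.0]],
    "tumor":  [[0.2, 1.0, 0.8, 0.1], [0.3, 0.9, 1.0, 0.2]],
}
centroids = {label: centroid(s) for label, s in training.items()}

label = classify([0.25, 0.95, 0.9, 0.15], centroids)   # → "tumor"
```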

Young will present his work at the American Society for Mass Spectrometry (ASMS) Annual Conference this June.

Another example of the application of machine learning to cancer can be found in the work of Daniel Lobo, an assistant professor of biology and computer science at the University of Maryland, Baltimore County (UMBC). He is using machine learning to map out the cellular communication networks that underlie cancer, and to design methods to disrupt them.

In their January 2017 paper in Scientific Reports, Lobo and collaborators showed that machine learning can uncover the cellular networks that determine pigmentation in tadpoles, and used those networks to reverse-engineer never-before-seen coloration. Their work was facilitated by Stampede, which enabled the team to run billions of simulations to identify models of the cellular network and ways of altering it.
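That search can be sketched at toy scale: enumerate candidate signed interaction networks, run a forward simulation of each, and keep the network whose predictions best match the observed data (the three-link model and “observations” here are hypothetical; the real search space is vastly larger, hence the billions of simulations):

```python
# Toy network inference as search: enumerate candidate signed
# interaction networks, simulate each with a forward model, and keep the
# network that best reproduces the observed data. The model and the
# "observations" are hypothetical.
from itertools import product

OBSERVED = [1, -1, -1]   # made-up responses of three cell populations

def simulate(network):
    """Hypothetical forward model: each response is the product of the
    signs of two of the three links (a, b, c)."""
    a, b, c = network
    return [a * b, b * c, a * c]

def mismatch(network):
    return sum((p - o) ** 2 for p, o in zip(simulate(network), OBSERVED))

best_network = min(product([-1, 1], repeat=3), key=mismatch)
# best_network reproduces the observed pattern exactly (mismatch 0).
```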

Lobo’s lab is applying the method to cancer research to determine what type of interventions might stop metastasis in its tracks without damaging other cells.

“Traditional approaches like chemotherapy attack the cells that grow the most, but leave cells that are signaling others to grow and that may be the most important,” Lobo says. “We’re using machine learning to find out the communication networks between these cells and hopefully to discover a treatment that can cause the tumor to collapse.”

“Getting a true understanding, given the complexity of the information, without some assistance from machine learning, is probably hopeless,” said Michael Levin, Lobo’s collaborator. “I think it’s inevitable that we use machine learning to enrich scientific and biomedical discovery.”

From patient-specific treatments to immunology to drug discovery, advanced computing is accelerating the basic and applied science underlying our understanding of cancer and the development and application of cancer treatments.

If scientists are the rocket in the Cancer Moonshot, HPC is the jet fuel.

About the Author

Aaron Dubrow joined TACC in October 2007 as the Science and Technology Writer, responsible for reporting on the myriad research and development projects undertaken by TACC.
