July 31, 2012
Despite the highly profitable nature of the pharmaceutical business and the large amount of R&D money companies throw at creating new medicines, the pace of drug development is agonizingly slow. Over the last few years, on average, fewer than two dozen new drugs have been introduced per year. One of the more promising technologies that could help speed up this process is supercomputing, which can be used not only to find better, safer drugs, but also to weed out compounds that would eventually fail during the later stages of drug trials.
According to a 2010 report in Nature, big pharma spends something like $50 billion per year on drug research and development. (To put that in perspective, that's four to five times the total spend for high performance computing.) The Nature report estimates the price tag to bring a drug successfully to market is about $1.8 billion, and rising. A lot of that cost is due to the high attrition rate of drug candidates, which fail because of problems in absorption, distribution, metabolism, excretion and toxicity that get uncovered during clinical trials.
Ideally, the drug makers would like to know which compounds were going to succeed before they got to the expensive stages of development. That's where high performance computing can help. The approach is to use molecular docking simulations on the computer to determine if a drug candidate can bind to the target protein associated with the disease. The general idea is to find the key (the small molecule drug) that fits in the lock (the protein).
AutoDock, probably the most widely used molecular docking application in the drug research community, played a role in developing some of the more successful HIV drugs on the market. Fortunately, AutoDock is freely available under the GNU General Public License.
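For readers curious about what a single docking run looks like in practice, here is a minimal sketch of driving AutoDock 4 from a script. The file names and parameter files (the .gpf grid and .dpf docking inputs, normally prepared with AutoDockTools) are placeholder assumptions for illustration, not details taken from the article.

    # Minimal sketch: one AutoDock 4 docking run driven from Python.
    # Assumes the receptor and ligand were already converted to PDBQT and the
    # .gpf/.dpf parameter files prepared (e.g. with AutoDockTools); all file
    # names here are placeholders.
    import subprocess

    def dock_one(gpf="receptor.gpf", dpf="ligand.dpf"):
        # Step 1: precompute affinity grids around the target protein's binding site.
        subprocess.run(["autogrid4", "-p", gpf, "-l", "receptor.glg"], check=True)
        # Step 2: search ligand poses against those grids and score the binding.
        subprocess.run(["autodock4", "-p", dpf, "-l", "ligand.dlg"], check=True)
        # The .dlg log lists the estimated free energy of binding for the best poses.
        return "ligand.dlg"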
The trick is to do these docking simulations on a grand scale. Thanks to the power of modern HPC machines, millions of compounds can now be screened against a protein in a reasonable amount of time. In truth, that timeframe depends on how many cores you can put to the task. For a typical medium-sized cluster that a drug company might have in-house, it would take several weeks to screen just a few thousand compounds against one target protein. To reach a more interactive workflow, you need something approaching a petascale supercomputer.
But not necessarily an actual supercomputer. Compute clouds have turned out to be very well suited to this type of embarrassingly parallel application. For example, in a recent test with 50,000 cores on Amazon's cloud (provisioned by Cycle Computing), the software was able to screen 21 million compounds against a protein target in less than three hours.
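A back-of-envelope calculation, using only the figures cited above and assuming perfect scaling with no setup overhead, shows why core count dominates the screening timeline. The per-compound cost is a rough estimate for that particular crude screening run; in practice it varies widely with the docking code and its accuracy settings.

    # Rough throughput estimate from the cited cloud run:
    # 50,000 cores screening 21 million compounds in under 3 hours.
    cores = 50_000
    compounds = 21_000_000
    hours = 3.0
    core_hours_per_compound = cores * hours / compounds      # ~0.007 core-hours
    print(round(core_hours_per_compound * 3600), "core-seconds per compound")  # ~26

    # Same per-compound cost projected onto a smaller in-house cluster.
    def wall_clock_hours(n_compounds, n_cores, cost=core_hours_per_compound):
        return n_compounds * cost / n_cores
    print(round(wall_clock_hours(1_000_000, 1_000), 1), "hours for 1M compounds on 1,000 cores")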
Real supercomputers work too. At Oak Ridge National Laboratory (ORNL), researchers used 50,000 cores of Jaguar to screen about 10 million drug candidates in less than a day. Jeremy C. Smith, director of the Center for Molecular Biophysics at ORNL, believes this type of virtual screening is the most cost-effective approach to turbo-charging the drug pipeline. But the real utility of the supercomputing approach, says Smith, is that it can also be used to screen out drugs with toxic side effects.
Toxicity is often hard to detect until it comes time to do clinical trials, the most expensive and time-consuming phase of drug development. Worse yet, sometimes toxicity is not discovered until after the drug has been approved and released into the wild. So identifying these compounds early has the potential to save lots of money, not to mention lives. As Smith says, "If drug candidates are going to fail, you want them to fail fast, fail cheap."
At the molecular level, toxicity is caused by a drug binding to the wrong protein, one that is actually needed by the body, rather than selectively binding to the protein causing the condition. The problem is that humans have about a thousand proteins, so every potential compound needs to be checked against each one. When you're working with millions of drug candidates, the job becomes overwhelming, even for the petaflop supercomputers of today. To tackle the toxicity problem, you'll need an exascale machine, says Smith.
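To get a feel for the combinatorial blow-up Smith is describing, consider the following sketch. The candidate count and per-docking cost are assumptions for illustration (the per-docking figure reuses the crude-screening estimate from the cloud example); higher-fidelity scoring costs orders of magnitude more per docking, which is where the exascale requirement comes from.

    # Illustrative scale of all-against-all toxicity screening.
    candidates = 5_000_000            # assumed: "millions of drug candidates"
    proteins = 1_000                  # proteins to cross-check for off-target binding
    core_seconds_per_docking = 26     # crude-screening estimate from the cloud run above

    dockings = candidates * proteins                           # 5 billion docking jobs
    core_hours = dockings * core_seconds_per_docking / 3600    # ~36 million core-hours
    print(f"{dockings:,} dockings, roughly {core_hours:,.0f} core-hours at crude accuracy")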
Besides screening for toxicity, the same exascale setup could be used to repurpose existing drugs for other medical conditions. That is, the drug docking software could use approved drugs as the starting point and try to match them against various target proteins known to cause disease. Right now, drug repurposing is typically discovered on a trial-and-error basis, but the increasing number of compounds in this multiple-use category suggests this could be a rich new area of drug discovery.
In any case, sheer compute power is not the complete answer. For starters, the software has to be scaled up to the level of the hardware, and on an exascale machine, that hardware is more than likely going to be based on heterogeneous processors. But since the problem is easily parallelized (each docking operation can be performed independently of the others), at least the scaling aspect should be relatively easy to overcome.
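The independence of the docking tasks is what makes the scaling straightforward: virtual screening is essentially a map over the compound library. Below is a minimal, self-contained sketch of that pattern; the scoring routine is a stand-in, since a production version would launch a docking engine and parse the predicted binding energy from its output.

    # Embarrassingly parallel screening: each ligand is scored independently,
    # so the workload maps cleanly onto however many cores are available.
    from multiprocessing import Pool
    import random

    def score_ligand(ligand_id):
        # Stand-in for a real docking run; returns (ligand, predicted kcal/mol).
        return ligand_id, random.uniform(-12.0, -2.0)

    def screen(ligand_ids, workers=8):
        # No communication between tasks, so adding cores (or cloud nodes)
        # shortens the wall-clock time almost linearly.
        with Pool(processes=workers) as pool:
            return pool.map(score_ligand, ligand_ids)

    if __name__ == "__main__":
        scores = screen([f"ligand_{i}" for i in range(1000)])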
The larger problem is that the molecular modeling software itself is imperfect. Unlike a true lock and key, proteins are dynamic structures, and the action of binding to a molecule changes their shape. Therefore, physics simulation is also required to get a more precise match.
AutoDock, for example, is only able to provide a crude match between drug and protein. To get higher fidelity docking, more compute-intensive algorithms are required. Researchers, like those at ORNL, often resort to more precise molecular dynamics codes after performing a crude screening run with AutoDock.
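That two-stage funnel can be sketched in a few lines: a fast, crude docking pass ranks the whole library, and only the best-scoring hits are handed off to the far more expensive molecular dynamics refinement. The shortlist size and data layout below are illustrative assumptions, not details from ORNL's workflow.

    # Two-stage funnel: crude docking over everything, MD refinement on the shortlist.
    def shortlist_for_md(crude_scores, top_n=1000):
        # crude_scores: (ligand_id, predicted_binding_energy) pairs, e.g. from the
        # screen() sketch above; lower (more negative) energy means tighter binding.
        ranked = sorted(crude_scores, key=lambda pair: pair[1])
        # Only this small fraction of the library goes on to molecular dynamics,
        # which can cost orders of magnitude more compute per compound.
        return ranked[:top_n]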
None of this guarantees that virtual docking on exascale machines will usher in a golden age of drug discovery. It's possible researchers will find that only a handful of small molecule compounds are both effective against disease and non-toxic. But Smith believes the approach is full of promise. "This is the way to design drugs since this mirrors the way nature works," he says.