Despite the highly profitable nature of the pharmaceutical business and the large amount of R&D money companies throw at creating new medicines, the pace of drug development is agonizingly slow. Over the last few years, on average, fewer than two dozen new drugs have been introduced per year. One of the more promising technologies that could help speed up this process is supercomputing, which can be used not only to find better, safer drugs, but also to weed out those compounds that would eventually fail during the later stages of drug trials.
According to a 2010 report in Nature, big pharma spends something like $50 billion per year on drug research and development. (To put that in perspective, that’s four to five times the total spend for high performance computing.) The Nature report estimates the price tag to bring a drug successfully to market is about $1.8 billion, and rising. A lot of that cost is due to the high attrition rate of drugs, which is caused by problems in absorption, distribution, metabolism, excretion and toxicity that are uncovered during clinical trials.
Ideally, the drug makers would like to know which compounds are going to succeed before they get to the expensive stages of development. That’s where high performance computing can help. The approach is to use molecular docking simulations on the computer to determine if the drug candidate can bind to the target protein associated with the disease. The general idea is to find the key (the small molecule drug) that fits in the lock (the protein).
AutoDock, probably the most common molecular modeling application for protein docking, is widely used by the drug research community. It played a role in developing some of the more successful HIV drugs on the market. Fortunately, AutoDock is freely available under the GNU General Public License.
The trick is to do these docking simulations on a grand scale. Thanks to the power of modern HPC machines, millions of compounds can now be screened against a protein in a reasonable amount of time. In truth, that timeframe is dependent upon how many cores you can put to the task. For a typical medium-sized cluster that a drug company might have in-house, it would take several weeks to screen just a few thousand compounds against one target protein. To reach a more interactive workflow, you need something approaching a petascale supercomputer.
But not necessarily an actual supercomputer. Compute clouds have turned out to be very suitable for this type of embarrassingly parallel application. For example, in a recent test with 50,000 cores on Amazon’s cloud (provisioned by Cycle Computing), software was able to screen 21 million compounds against a protein target in less than three hours.
Real supercomputers work too. At Oak Ridge National Lab (ORNL), researchers used 50,000 cores of Jaguar to screen about 10 million drug candidates in less than a day. Jeremy C. Smith, director of the Center for Molecular Biophysics at ORNL, believes this level of virtual screening is the most cost-effective approach to turbo-charge the drug pipeline. But the real utility of the supercomputing approach, says Smith, is that it can also be used to screen out drugs with toxic side effects.
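To put the two runs on a common footing, the quoted figures can be turned into a rough compute cost per compound. The sketch below is only back-of-the-envelope arithmetic: the function name is made up for illustration, and because both runs are reported as taking "less than" the stated time, the results are upper bounds rather than measurements.

```python
def core_seconds_per_compound(cores, hours, compounds):
    """Upper bound on core-seconds spent per compound in one screening run."""
    return cores * hours * 3600 / compounds

# Cloud run: 50,000 cores, under 3 hours, 21 million compounds.
cloud = core_seconds_per_compound(50_000, 3, 21_000_000)

# Jaguar run: 50,000 cores, under a day (taken as 24 hours), about 10 million compounds.
jaguar = core_seconds_per_compound(50_000, 24, 10_000_000)

print(f"cloud:  at most {cloud:.1f} core-seconds per compound")   # 25.7
print(f"jaguar: at most {jaguar:.1f} core-seconds per compound")  # 432.0
```

The gap between the two numbers should not be read as a hardware comparison. The runs likely differed in software, docking settings and target protein, none of which are specified here.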
Toxicity is often hard to detect until it comes time to do clinical trials, the most expensive and time-consuming phase of drug development. Worse yet, sometimes toxicity is not discovered until after the drug has been approved and released into the wild. So identifying these compounds early has the potential to save lots of money, not to mention lives. As Smith says, “If drug candidates are going to fail, you want them to fail fast, fail cheap.”
At the molecular level, toxicity is caused by a drug binding to the wrong protein, one that is actually needed by the body, rather than just selectively binding to the protein causing the condition. The problem is humans have about a thousand proteins, so every potential compound needs to be checked against each one. When you’re working with millions of drug candidates, the job becomes overwhelming, even for the petaflop supercomputers of today. To tackle the toxicity problem, you’ll need an exascale machine, says Smith.
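The scale of that cross-check is easy to estimate. The sketch below multiplies candidates by proteins and converts the total into wall-clock time. The cost of 25 core-seconds per docking is an assumption, loosely in line with the cloud run quoted earlier (50,000 cores for under three hours across 21 million compounds), and the function name is hypothetical.

```python
def screening_estimate(candidates, proteins, core_seconds_per_docking, cores):
    """Return (total dockings, wall-clock days) for an all-against-all screen."""
    dockings = candidates * proteins
    wall_seconds = dockings * core_seconds_per_docking / cores
    return dockings, wall_seconds / 86_400  # 86,400 seconds per day

# 10 million candidates against about a thousand proteins,
# at an ASSUMED 25 core-seconds per docking, on 50,000 cores.
dockings, days = screening_estimate(10_000_000, 1_000, 25, 50_000)

print(f"{dockings:,} dockings")        # 10,000,000,000 dockings
print(f"about {days:.0f} days")        # about 58 days
```

Under these assumptions, a job that fits in an afternoon for one target protein stretches to roughly two months once every protein has to be checked, and that is with crude docking only.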
Besides screening for toxicity, the same exascale setup can be used to repurpose existing drugs for other medical conditions. That is, the drug docking software could use approved drugs as the starting point and try to match them against various target proteins known to cause disease. Right now, new uses for existing drugs are typically discovered on a trial-and-error basis, but the increasing number of compounds that are now in this multiple-use category suggests this could be a rich new area of drug discovery.
In any case, sheer compute power is not the complete answer. For starters, the software has to be scaled up to the level of the hardware, and on an exascale machine, that hardware is more than likely going to be based on heterogeneous processors. But since the problem is easily parallelized (each docking operation can be performed independently of the others), at least the scaling aspect should be relatively easy to overcome.
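The independence of the docking operations is what makes the workload easy to spread out, and a minimal sketch shows the pattern. Everything here is illustrative: toy_score is a hypothetical stand-in for a docking run, not a real scoring function, and a thread pool is used only to keep the example self-contained. On a real cluster or cloud the chunks would go to separate processes or nodes.

```python
from concurrent.futures import ThreadPoolExecutor

def toy_score(compound_id):
    """Hypothetical stand-in for one docking run; NOT a real scoring function."""
    return ((compound_id * 2654435761) % 1000) / 1000.0

def split(items, n_workers):
    """Deal items round-robin into n_workers independent chunks."""
    return [items[i::n_workers] for i in range(n_workers)]

def score_chunk(chunk):
    """Score one chunk; needs nothing from any other chunk."""
    return [(cid, toy_score(cid)) for cid in chunk]

def screen(compound_ids, n_workers=4):
    """Score every compound in parallel and return (id, score), lowest score first."""
    chunks = split(compound_ids, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = pool.map(score_chunk, chunks)
    merged = [pair for chunk in results for pair in chunk]
    # Sort by score, then by id, so the ranking does not depend on chunking.
    return sorted(merged, key=lambda pair: (pair[1], pair[0]))

ranking = screen(list(range(1000)), n_workers=8)
print(ranking[:3])
```

Because no chunk waits on another, the only coordination is the final merge, which is why adding cores shortens the run almost proportionally for this kind of job.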
The larger problem is that the molecular modeling software itself is imperfect. Unlike a true lock and key, proteins are dynamic structures, and the action of binding to a molecule changes their shape. Therefore, physics simulation is also required to get a more precise match.
AutoDock, for example, is only able to provide a crude match between drug and protein. To get higher fidelity docking, more compute-intensive algorithms are required. Researchers, like those at ORNL, often resort to more precise molecular dynamics codes after performing a crude screening run with AutoDock.
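That crude-then-precise workflow is essentially a funnel, which can be sketched in a few lines. The two scoring functions below are toy placeholders, not AutoDock or any molecular dynamics code, and the convention that lower scores are better is an assumption made for the example.

```python
import heapq

def two_stage_screen(compounds, cheap_score, expensive_score, keep):
    """Rank everything with the cheap score, then rescore only the best `keep`."""
    shortlist = heapq.nsmallest(keep, compounds, key=cheap_score)
    rescored = [(expensive_score(c), c) for c in shortlist]
    return sorted(rescored)  # best (lowest) refined score first

# Toy placeholders: compounds are integers, scores are distances from a "best" value.
expensive_calls = []

def crude(c):
    return abs(c - 50)          # fast, approximate

def refined(c):
    expensive_calls.append(c)   # record how often the costly step runs
    return (c - 47) ** 2        # slow, more precise

result = two_stage_screen(range(100), crude, refined, keep=5)
print(result[0])             # (1, 48)
print(len(expensive_calls))  # 5
```

In this toy case the refined score picks a different winner (48) than the crude score would have (50), while the costly step runs on only 5 of the 100 compounds. The tradeoff is that a good compound ranked poorly by the crude pass never reaches the refined stage.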
None of this is a guarantee that virtual docking on exascale machines is going to launch a golden age of drugs. It’s possible that researchers will discover that there are just a handful of small molecule compounds that are both effective against disease and non-toxic. But Smith believes this approach is full of promise. “This is the way to design drugs since this mirrors the way nature works,” he says.