Biotech and pharmaceutical organizations are increasingly looking to apply Grid computing to their research efforts.
Life science software that runs on Grids spans a wide range of applications, the most common being molecule screening algorithms and DNA sequence analysis routines.
The nature of these applications makes them good candidates for Grid computing. For example, in the case of molecule screening, the typical computational problem involves checking whether any of the millions to billions of molecules in a collection have the right 3-D shape and chemical properties to potentially be used to fight a disease. The common approach here is to give every Grid node the disease target against which each molecule will be tested. Then, the Grid application divides up the collection of molecules and distributes them to the nodes. This type of application is often called a molecular docking application.
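The distribution pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`partition`, `toy_score`, `screen_on_node`, `screen_library`) are hypothetical, the "nodes" are simulated by ordinary function calls rather than real Grid middleware, and `toy_score` is a stand-in that bears no relation to a real docking score.

```python
from typing import List


def partition(items: List[str], n_nodes: int) -> List[List[str]]:
    """Split the molecule collection into n_nodes roughly equal chunks."""
    size, extra = divmod(len(items), n_nodes)
    chunks, start = [], 0
    for i in range(n_nodes):
        # The first `extra` chunks each take one additional item.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def toy_score(target: str, molecule: str) -> int:
    """Placeholder score (assumption): number of distinct shared characters.
    A real docking program would evaluate 3-D shape and chemistry."""
    return len(set(target) & set(molecule))


def screen_on_node(target: str, molecules: List[str], threshold: int) -> List[str]:
    """Work done by one node: test its share of molecules against the target."""
    return [m for m in molecules if toy_score(target, m) >= threshold]


def screen_library(target: str, library: List[str],
                   n_nodes: int, threshold: int) -> List[str]:
    """Every node gets the same target; the library is divided among nodes."""
    hits: List[str] = []
    for chunk in partition(library, n_nodes):
        # On a real Grid each chunk would be sent to a different machine.
        hits.extend(screen_on_node(target, chunk, threshold))
    return hits
```

Because each molecule is scored independently of the others, the chunks can be processed in any order and on any node, which is what makes this kind of workload a natural fit for a Grid of loosely connected PCs.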
With the DNA sequence routines, the division of work is reversed: each node holds a portion of a genome database, and the genetic sequence to be compared is sent to all the nodes, where it is checked against each node's portion of the larger database.
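A minimal sketch of this second pattern follows, again in Python and again under assumptions: the helper names (`shard_database`, `search_shard`, `search_all`) are hypothetical, the nodes are simulated in a single process, and an exact substring search stands in for the sequence alignment a real routine would perform.

```python
from typing import Dict, List, Tuple


def shard_database(records: Dict[str, str], n_nodes: int) -> List[Dict[str, str]]:
    """Assign database records to nodes round-robin, in name order."""
    shards: List[Dict[str, str]] = [{} for _ in range(n_nodes)]
    for i, (name, seq) in enumerate(sorted(records.items())):
        shards[i % n_nodes][name] = seq
    return shards


def search_shard(shard: Dict[str, str], query: str) -> List[Tuple[str, int]]:
    """Work done by one node: look for the query in its portion of the database.
    Exact matching is a placeholder (assumption) for real sequence alignment."""
    return [(name, seq.find(query)) for name, seq in shard.items() if query in seq]


def search_all(shards: List[Dict[str, str]], query: str) -> List[Tuple[str, int]]:
    """Send the same query to every node and merge the results."""
    matches: List[Tuple[str, int]] = []
    for shard in shards:
        # On a real Grid each shard would live on a different machine.
        matches.extend(search_shard(shard, query))
    return sorted(matches)
```

The contrast with molecule screening is the direction of data movement: here the large database stays put on the nodes and only the small query travels, which keeps network traffic low for each search.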
Such applications are widely used throughout the biotech and pharmaceutical industries. The actual use of Grid computing varies greatly from organization to organization, but two distinct scenarios stand out in the life sciences.
In one common approach, a company sets up an internal Grid that complements its existing high performance computing operations. In the other approach, an organization, such as a university or a group dedicated to fighting a particular disease, asks people to essentially donate spare PC compute cycles to speed up research efforts.
Notable examples in the first category include Grid projects at Novartis and Johnson & Johnson.
For example, about 18 months ago, Novartis' Grid effort started with a 50-node pilot project that quickly grew to a Grid of more than 2,700 office PCs. The Grid ran common bioinformatics routines including sequence analysis and molecular docking algorithms. Once the pilot was up and running, the company claimed the Grid's processing power was about 5 teraflops (5 trillion floating point operations per second). If that performance were sustained and benchmarked, it would be roughly on a par with the world's 30th most powerful supercomputer.
Processing power is one thing; results are another. The Grid project didn't have specific goals, but it was thought that the extra processing power might help Novartis identify up to 10 times more potential drug targets per year.
The Grid immediately helped in making a scientific discovery. Running a docking program, the Grid screened the corporate library of compounds and found a previously unknown potential cancer inhibitor called a protein kinase CK2 inhibitor. The results were published in the Journal of Medicinal Chemistry.
In many life science companies, Grid efforts have been departmental in nature. But noting the increased computing resources that a wider-scale effort would deliver and the potential for making faster scientific discoveries, some companies are making Grids a corporate venture.
That is the case with Johnson & Johnson, which earlier this year expanded its research and development Grid efforts from discrete departmental projects into a company-wide initiative. The idea was to deploy a single global Grid that would host many applications and be centrally managed.
The Grid project is being carried out under the purview of the J&J Pharma R&D IM (Information Management) group. A pilot project started earlier this year was expected to grow the Grid from about 450 nodes to 3,000 nodes by the third or fourth quarter of this year.
Philanthropic Grids
Outside of the corporate arena, Grids are also being used by organizations to help conduct basic research into common diseases. Many of these efforts are philanthropic projects run by research organizations.
Many of these projects resemble SETI@Home: people are asked to download some software and let the organization take advantage of the spare CPU cycles on their home computers. Examples of these types of Grids include the Scripps Research Institute's FightAIDS@Home project, the Smallpox Research Grid Project and the World Community Grid.
Most of these efforts are molecule screening projects. For instance, the goal of the Smallpox Research Grid Project is to screen about 35 million molecules against a handful of target proteins.
The World Community Grid project takes a slightly different research approach. Its participants are helping examine how proteins fold. The information derived about protein folding is useful when trying to find treatments and cures for diseases such as cancer, HIV/AIDS, malaria and SARS.
Most of these philanthropic efforts have a technology partner, such as IBM or United Devices. These partners supply the underlying Grid infrastructure and management tools that allow the organization to coordinate and run its research on a Grid. Often, there is also a life science software partner that, for example, makes an application Grid-enabled.
The trend in this philanthropic Grid area is simply to get more people to participate.
The bottom line is that Grid computing is increasingly being called on within biotech and pharmaceutical companies, as well as by research organizations, to accelerate research in the life sciences. Most of these efforts have the ultimate goal of finding new drugs to treat and cure diseases.
About Salvatore Salamone
Salvatore Salamone is the senior IT editor at Bio-IT World (www.bio-itworld.com). He will be chairing a three-day IT Solutions for Drug Discovery conference track May 17 to May 19 at the Bio-IT World Conference & Expo, to be held in the Hynes Convention Center in Boston. Several of the talks in that conference track will focus on the use of Grid computing in the life sciences. More information about the conference can be found at www.bio-itworldexpo.com.