SCIENCE & ENGINEERING NEWS
San Diego, Calif. – Picture this: millions of iMac and PC owners around the world using their home computers to help scientists solve complex computational problems. It may sound far-fetched, reports Louisa Dalton, but the concept – known as distributed computing – has become a groundbreaking tool for astronomers, biochemists and other researchers seeking a fast and cheap alternative to expensive supercomputers.
Distributed computing can be a valuable asset in virtually any computationally intensive experiment, according to Vijay S. Pande, an assistant professor of chemistry at Stanford. “A handful of projects have already demonstrated how such large-scale distributed computing power can be utilized,” write Pande and chemistry graduate student Michael Shirts in the Dec. 8 issue of the journal Science.
A well-known example cited by the authors is SETI@home, a scientific experiment based in Berkeley, Calif., that uses home computers in the Search for Extraterrestrial Intelligence.
SETI@home gives anyone connected to the Internet an opportunity to hunt for signs of intelligent life in the universe by analyzing radio signals from outer space. Volunteers simply download the SETI@home screensaver and software. While they are away from their computers, the screensaver pops up and begins processing the radio signals. Meanwhile, the software automatically checks in at a central website to drop off results and pick up new assignments.
Roughly a half-million users now run SETI@home. “This large number of processors dwarfs even the largest supercomputers,” say Shirts and Pande. They point out that, in just three years, the project accomplished what a single computer would have taken 400,000 years to do.
But SETI@home is only the beginning. “There are at least 300 million personal computers on the Internet,” write the authors, but up to 90 percent of all PC processing time is wasted, they say, creating a massive untapped reservoir of potential computing power worldwide.
Shirts and Pande estimate that if only half of all PCs now connected to the Internet participated in distributed computing, there would be sufficient capacity for 300 SETI-sized projects – everything from climate modeling and robotic design to nuclear reaction simulations.
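The estimate can be checked against the figures quoted in the article. The following is only a rough sketch of that arithmetic, assuming a "SETI-sized" project means about a half-million participating computers; the function and variable names are illustrative and do not come from the Science paper.

```python
# Figures quoted in the article (approximate).
PCS_ON_INTERNET = 300_000_000   # "at least 300 million personal computers"
SETI_USERS = 500_000            # "roughly a half-million users"
PARTICIPATION = 0.5             # "if only half of all PCs ... participated"

def seti_sized_projects(total_pcs, share, project_size):
    """Return how many projects of project_size computers the volunteers could staff."""
    volunteers = total_pcs * share
    return int(volunteers // project_size)

projects = seti_sized_projects(PCS_ON_INTERNET, PARTICIPATION, SETI_USERS)
print(projects)  # 300
```

Half of 300 million is 150 million computers, and 150 million divided by a half-million is 300, which matches the authors' figure.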
“Perhaps the most exciting possibility, however, is in the biological realm,” say the authors. “In the last few years, the huge amount of raw scientific data generated by molecular biology, structural biology and genomics has outstripped the analytical capabilities of modern computers,” they write.
“The exciting thing about distributed computing right now,” Pande adds, “is that there are a lot of interesting biological questions that are at the moment too difficult for single computers.”
One example is protein folding, often called the Holy Grail of molecular biology. The human body produces thousands of different kinds of proteins. Each one has to fold into a specific, three-dimensional shape to function properly. Some proteins resemble intricate pretzels, while others are twisted and woven into braids.
Pande and his laboratory team have done extensive research on protein folding. “What makes it such a great challenge,” he says, “is its complexity, which renders simulations of folding extremely computationally demanding and difficult to understand.”
About two months ago, Pande and his Stanford research team launched Folding@home, an Internet program that calculates how proteins achieve their three-dimensional shape. The project has taken off so well that Pande now has some 10,000 volunteers doing biochemistry research for him on their home computers.
David Noblet runs Folding@home on about 10 computers in his house in New Hampshire. He's a computer programmer who likes the idea of using his extra PCs for the good of society. “It feels like a waste to have all that potential sitting around all day,” Noblet says.
He also is a member of Team Egg Roll, an informal group of Folding@home users vying for bragging rights over who can tally the most computing days.
“The Folding@home project posts statistics for individuals and teams to help foster friendly competition and generate more interest in running the software,” Pande says. Team Egg Roll is now in second place, having reached 10,000 folding days, but it lags far behind the top-ranked team, Nerdz, whose members have clocked 17,000 days – or about 47 years.
Folding@home is one of several new bioscience distributed systems launched in the past year. Others are focusing on HIV drug design, flu vaccine modeling and cancer therapy.
Pande is quick to point out that it's not easy setting up thousands of home computers to process mounds of raw genetic and protein data.
“Just giving someone 100,000 computers doesn't solve the problem,” he maintains. “It's like giving someone 100,000 secretaries,” he adds. “What you need is a way to organize these guys and come up with ways that you could actually use all the secretaries. Otherwise, you end up wasting them.”
Pande says that, in distributed computing, every task must be carefully divided into bite-size pieces that a single computer can handle.
With SETI@home, the process is fairly straightforward. Since each computer looks at its own bit of sky, it's like handing 100,000 secretaries their own separate page of dictation to type up.
But biological computations, such as protein folding, are more difficult to parcel out. Sometimes, for example, one computer has to wait for another to finish its bit of the protein puzzle.
The biologist's challenge, says Pande, is to break up the mammoth problem efficiently into workable, single-computer chunks.
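The difference between the two kinds of problem can be illustrated with a toy example. This is a minimal sketch, not the method used by either project: the thread pool stands in for volunteer computers, and analyze_chunk, split, run_independent and run_dependent are hypothetical names invented for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def analyze_chunk(chunk):
    """Stand-in for the real analysis of one bite-size piece of data."""
    return sum(chunk)

def split(data, size):
    """Cut a long list of measurements into work units of at most `size` items."""
    return [data[i:i + size] for i in range(0, len(data), size)]

def run_independent(data, size, workers=4):
    """Sky-survey style: no unit needs another's result, so all can run at once."""
    units = split(data, size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input units.
        return list(pool.map(analyze_chunk, units))

def run_dependent(data, size):
    """Chained style: each unit needs the previous result, so units run in order."""
    state = 0
    results = []
    for unit in split(data, size):
        state = state + analyze_chunk(unit)  # must wait for the earlier units
        results.append(state)
    return results

signal = list(range(10))
independent = run_independent(signal, 4)
dependent = run_dependent(signal, 4)
print(independent)  # [6, 22, 17]
print(dependent)    # [6, 28, 45]
```

In the first case, extra computers shorten the job almost in proportion to their number. In the second, every step waits on the one before it, so the work has to be reorganized before more machines can help.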
In all distributed computing projects, it's the volunteers, not the scientists, who end up doing the bulk of the labor. So finding volunteers willing to run simulation software at home is an essential first step for any successful project, according to Shirts and Pande.
“The user must have some interest in volunteering his or her computer,” they write. SETI@home, for example, has spawned a great deal of international excitement among space aficionados.
“Biological and biomedical applications may have an even greater potential for generating public interest,” the authors say. Pande carefully protects the privacy of his Folding@home volunteers, but he says that many of them are in health professions.
“We get a lot of people in medicine, in hospitals – people who are just interested in biology,” he says.
Another important point, write Shirts and Pande: “Distributed systems must not interfere with the user's personal use. This is most commonly (and perhaps most elegantly) done using screensavers.” The authors note that screensavers also allow the vast majority of idle computer time to be utilized for the project.
Another benefit of distributed computing is that it gives people unprecedented access to the world of experimental science.
“The involvement of hundreds of thousands of nonscientists in research opens the door to new means of science education and outreach, in which the public becomes an active participant,” the authors conclude.
“Part of our mission is to educate people,” says Pande, so when volunteers check his Folding@home website, they are provided a “folding fact of the day,” along with easy-to-understand background information about the project.
Pande knows he must treat his volunteers well. After all, they picked his screensaver over flying toasters.