Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them
April 24, 2009
COLLEGE PARK, Md., April 23 -- DNA sequencing is the next frontier in biological research. As new sequencing technology becomes more efficient and affordable, it is increasingly available to small laboratories. Thus, sequencing data is being generated at a faster rate than ever before.
However, the computing capacity needed to analyze such vast amounts of data still has some catching up to do. Large networks of interconnected computers, called computer clusters, are required to analyze these data. Expensive to establish and maintain, these computer clusters are generally available only to labs that can afford them.
Enter Mihai Pop, an assistant professor in the department of computer science and in the Center for Bioinformatics and Computational Biology at the University of Maryland. He and colleague Steven Salzberg, director of the center and Horvitz Professor of computer science, recently received a grant from the National Science Foundation Cluster Exploratory Program (CluE) to fund research aimed at discovering how remote cluster computers, computer networks available over the Internet, might be used to process DNA sequence data.
"There is a new initiative by NSF to figure out what you can do with cluster computers on the internet -- like the ones through Amazon, Google, and IBM," Pop said. "Our NSF grant will be used to find out if remote clusters of computers are a better option for DNA sequence analysis than local clusters of computers."
Pop's goal is to develop the software required to analyze sequence data in parallel (on many computers simultaneously). This massively parallel computing allows faster gene sequence alignment and genome assembly.
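As a rough illustration of what analyzing sequence data in parallel can look like, the sketch below splits a set of reads into chunks, counts short subsequences (k-mers) in each chunk independently, and then merges the partial counts. K-mer counting is only a stand-in for a real alignment or assembly step, not the software Pop's group is developing, and the function names, the thread pool standing in for cluster nodes, and the worker count are all assumptions made for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(reads, n_chunks):
    """Deal reads round-robin into n_chunks lists, one per hypothetical worker."""
    return [reads[i::n_chunks] for i in range(n_chunks)]


def count_kmers(reads, k):
    """Map step: count every length-k substring in one chunk of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def parallel_kmer_counts(reads, k, n_workers=4):
    """Count k-mers chunk by chunk, then merge the partial results (reduce step).

    The thread pool is a local stand-in for the nodes of a cluster; on a real
    cluster each chunk would be shipped to a separate machine.
    """
    chunks = split_into_chunks(reads, n_workers)
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(lambda chunk: count_kmers(chunk, k), chunks):
            total.update(partial)
    return total
```

Because each chunk is processed without reference to the others, the same split-then-merge pattern could in principle run on a local cluster or a rented one; only the place where the chunks are sent changes.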
While parallel computing is already being used on locally maintained computer clusters, Pop will be working on programs that will allow researchers to perform their DNA sequence analysis over the web by accessing remote computer clusters maintained by large companies on a pay-per-use basis. This paradigm is known as cloud computing.
So now, rather than buying and maintaining their own computer systems, researchers may simply be able to rent computer time at a fraction of the cost. But there are a few obstacles to overcome before cloud computing becomes a reality for genetic analysts.
"The first question is how to best split up the process of DNA sequence analysis to fit these computer clusters," Pop said. "The second is whether or not the benefits of cloud computing outweigh the costs of data transfer and storage."
The massive amounts of data generated by just one genome may take a significant amount of time to transfer over the internet. This, in addition to the data storage needed before analysis, might add costs that outweigh the benefits of using a remote computer cluster.
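The trade-off can be made concrete with a back-of-the-envelope calculation. The sketch below estimates how long a data set takes to upload and compares the total time on a remote cluster (transfer plus faster computation) against running locally. The function names and every input value are hypothetical, and the model ignores storage fees and per-hour pricing, which would need to be added for a real cost comparison.

```python
def transfer_hours(data_gb, bandwidth_mbps):
    """Hours needed to move data_gb gigabytes over a bandwidth_mbps megabit/s link."""
    megabits = data_gb * 8 * 1000  # decimal units: 1 GB = 8,000 megabits
    return megabits / bandwidth_mbps / 3600


def cloud_total_hours(data_gb, bandwidth_mbps, local_compute_hours, speedup):
    """Upload time plus compute time on a remote cluster assumed to be
    `speedup` times faster than the local machine."""
    return transfer_hours(data_gb, bandwidth_mbps) + local_compute_hours / speedup


def cloud_is_faster(data_gb, bandwidth_mbps, local_compute_hours, speedup):
    """True when transfer plus remote compute beats computing locally."""
    total = cloud_total_hours(data_gb, bandwidth_mbps, local_compute_hours, speedup)
    return total < local_compute_hours
```

With these illustrative inputs, 100 GB over a 100 megabit/s link takes a little over two hours to upload, so a job that needs 10 hours locally still finishes sooner on a cluster ten times faster. Moving 1,000 GB over a 10 megabit/s link takes more than 200 hours, which wipes out any computational gain.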
"Even if the analysis doesn't take long, the transfer may take forever and cost too much to make the whole thing worthwhile," said Pop.