July 17, 2012
BLOOMINGTON, Ind., July 17 -- Key software used to study gene expression now runs four times faster, thanks to performance improvements put in place by a team from the Indiana University Pervasive Technology Institute (PTI), the Broad Institute of MIT and Harvard and Technische Universität Dresden.
The timesaving breakthroughs will allow bioinformaticians and biologists who study RNA sequences to analyze more data in a shorter amount of time. This will speed the understanding of biological processes in fields as diverse as ecology, evolution, biofuels and medicine.
Robert Henschel and Richard D. LeDuc, of PTI and IU's National Center for Genome Analysis Support (NCGAS), announced the findings today at the XSEDE12 conference in Chicago. They worked with partners from the Broad Institute and the Center for Information Services and High Performance Computing (ZIH) at Technische Universität Dresden to achieve this advance in a fast-growing area of computational biology.
The software, known as Trinity, was developed by researchers at the Broad Institute and Hebrew University. It produces high-quality RNA sequence assemblies used by scientists studying gene expression. These RNA sequence assemblies allow scientists to know which genes are active within a living creature. Trinity is especially useful for studying organisms without a complete genome sequence, such as agricultural pests, ecological indicator species and human parasites.
The software has long been considered a leader in the field, but it needed some fine-tuning.
"IU research technologists strive to deliver tools and services that accelerate discoveries for scientists all over the world. By collaborating with our counterparts at Broad and ZIH, we were able to do just that with Trinity. This is just one example of how the various centers affiliated with PTI—such as NCGAS—improve the capabilities of scientists at home and abroad," said Craig Stewart, executive director of IU's Pervasive Technology Institute and principal investigator of the National Science Foundation grant that funds NCGAS.
"In the past, Trinity was a high quality tool but the run time was too long," said Henschel. "Now with our performance improvements, it runs as fast as the competition—if not faster—and still produces superior quality sequence assemblies."
The partners first used standard high performance computing techniques to improve the software's speed. Specifically, this involved building Trinity with an optimizing compiler for the Intel® Xeon® architecture and using optimizing compiler flags. In addition, the team properly configured the application to take full advantage of multicore, multisocket compute nodes in today's clusters.
Next, the team fine-tuned each part of the Trinity package to improve the overall scalability of the application. They used Vampir performance analysis tools, developed at ZIH, to gain insights into the software's performance. The optimizations included improving and parallelizing input/output, simplifying data structures for better performance and optimizing parallel regions in the application.
Henschel is hopeful that IU's work with Trinity will continue. "We are working on establishing a continued collaboration between IU, Broad and ZIH to further optimize Trinity," said Henschel. "We hope these performance improvements are just the beginning of a longer term relationship that will continue to benefit biological research."
About XSEDE12 and XSEDE
XSEDE12 is the first conference of the Extreme Science and Engineering Discovery Environment (XSEDE), a national collaboration that provides cyberinfrastructure services and resources to support scientific discovery in fields such as medicine, engineering, earthquake science, epidemiology, genomics, astronomy and biology.
XSEDE is funded through a five-year, $121 million National Science Foundation (NSF) grant. For more, see http://www.xsede.org.
About Indiana University Pervasive Technology Institute
The Pervasive Technology Institute is IU's flagship initiative for advanced information technology research, development and delivery in support of research, scholarship and artistic performances. The National Center for Genome Analysis Support (which includes LeDuc) and the Research Technologies division (which includes Henschel) are both Service and CyberInfrastructure Centers affiliated with PTI. For more, see http://pti.iu.edu.
About the Broad Institute of MIT and Harvard
The Eli and Edythe L. Broad Institute of Harvard and MIT was launched to empower creative scientists to transform medicine. The Broad Institute seeks to describe all the molecular components of life and their connections; discover the molecular basis of major human diseases; develop effective new approaches to diagnostics and therapeutics; and disseminate discoveries, tools, methods and data openly to the entire scientific community. For more, see http://www.broadinstitute.org.
About the Center for Information Services and High Performance Computing at Technische Universität Dresden
The Center for Information Services and High Performance Computing (ZIH) at Technische Universität Dresden in Germany supports other departments and institutions in their research and education for all matters related to information technology and computer science. For more, see http://www.tu-dresden.de/zih.
-----
Source: Indiana University