The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
June 19, 2008
The TeraGrid, the National Science Foundation's evolving program of cyberinfrastructure for U.S. science and education, held its third annual conference June 9-13 in Las Vegas. Observing three years of TeraGrid full-production operation, TG08 opened with a presentation from Dan Reed, one of the people most instrumental in TeraGrid's 2001 genesis as NSF's flagship cyberinfrastructure.
After founding and directing the Renaissance Computing Institute (RENCI) at the University of North Carolina, Reed moved to Microsoft, where he is Scalable and Multicore Computing Strategist. Before RENCI, he was director of NCSA in Illinois. There, building on the PACI Alliance notion of a distributed grid of shared resources, he helped to develop the TeraGrid vision. The idea, Reed reminded his audience of about 350 researchers, educators and TeraGrid staff, was "to begin escaping the tyranny of data captured at single supercomputing sites."
After looking back to TeraGrid's origins, Reed focused on the future. "What can we learn from the TeraGrid experience, technically and politically? Where is the technology going and what are the research implications?" He referred to a recent special issue of Nature that explores the state of science in 2020, noting that science in the 21st century is inextricable from computing.
He quoted from the study: "From sequencing genomes to monitoring the Earth's climate, many recent scientific advances would not have been possible without a parallel increase in computing power -- and with revolutionary technologies such as the quantum computer edging towards reality, what will the relationship between computing and science bring us over the next 15 years?"
As befitted the Las Vegas setting, Reed asked his audience to ponder risk versus reward. "What probability of successful return would you accept to be the first human to set foot on Mars?" Twenty years ago, he noted, grids were research curiosities and a terabyte was many disks of data. "The future depends on vision and context."
The context has changed radically in a short time, he noted: bulk computing has become almost free relative to software and power. "Nowadays you can buy a lot of computing on your credit card. We still don't have terabit transcontinental networks for research use; moving lots of data is still hard. The big cost is people. The cost of a professional software developer for a year is now more than a teraflop computing cluster."
Today's context, says Reed, is a Five-Fold Way comprising 1) many-core on-chip parallelism, 2) big ("really big") datacenters, 3) web services, 4) ubiquitous sensors (producing huge data volumes), and 5) "clouds" as an evolving model of computational service. Further increases in computer performance now require embracing multicore parallelism; hardware progress has outstripped progress in the software needed to exploit it.
An important goal, Reed emphasized, is context-aware information. Referring to Vannevar Bush's vision of a national research enterprise, which led eventually to the National Science Foundation, Reed called for services, including datacenters and cloud computing, that can put the right information in the right heads at the right time.
Data models, noted Reed, are in rapid flux because of larger and larger data volumes. This is especially pronounced in some fields, such as biomedical research, where large databases are subject to distributed analysis. A big challenge, probably underappreciated, says Reed, is the scale of the data deluge. "We will be running queries on 100,000 servers," said Reed. "And research is moving from being hypothesis driven ('I have an idea, let me verify it.') to exploratory ('What correlations can I glean from everyone's data?'). This kind of exploratory analysis will rely on tools for deep data-mining." Massive, multi-disciplinary data, said Reed, is rising rapidly and at unprecedented scale.
In discussing next-generation applications and cyberinfrastructure investment, Reed noted that the historical model of "punctuated competitions" is not optimal because it tends to favor a culture of competition among research centers over long-term collaboration. Research and infrastructure, he noted, mix badly since "it takes a long time to identify appropriate practices and software." Sustainability really matters because software and organizations take time to develop.
Grids and clouds, says Reed, will tend to fuse over time. The rapid growth in the size and capability of commercial computing clouds, as exemplified by work underway at Microsoft, is driven by economics. Reliable, centrally hosted infrastructure provides commercially based services. While grids are more tailored for academic agendas, economic factors will tend to bring these two related service models together in a fusion that is more than the sum of its parts.
Returning to the ratio of risk and reward as he concluded, Reed stressed the need to ask big questions. In Reed's view, there are basically three: 1) biology, understanding of life and nature, 2) the universe, how matter came to be and cosmic structure, and 3) the human condition, where biology and the universe intersect in the sphere of human creativity and social life. Answering the big questions requires boldness and interdisciplinary partnerships. With the three-fold way of science -- theory, simulation and experiment -- now proven, says Reed, "Great things are ahead. We are positioned to do amazing things."