HPCwire

The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing

Sun's Hero Program: Changing the Productivity Game


Sun Microsystems sees a future where developers will be able to write tera- or peta-scale applications as easily as they write applications for dozens or perhaps hundreds of processors today. In that future, rich bandwidth, low latencies, very high levels of fault tolerance, and a highly integrated toolset allow researchers to focus on "scaling the program and not the programmers."

Not long ago, a computational scientist could personally write, debug and optimize code to run on a leadership-class high performance computing system without the help of others. Today, things are much harder: programming a cluster of machines is significantly more difficult than traditional programming, and the scale of the machines and problems has increased more than 1,000 times. Also, simply owning and running high-end computational facilities for nuclear research, seismic modeling, gene sequencing or business intelligence takes a sizeable investment in staffing, procurement and operations. The complexities associated with HPC continue to increase, and as a result, many advances and scientific discoveries are hampered. Even for organizations that can afford to staff a sizeable team, the resulting application often achieves only 5 to 10 percent of the theoretical peak performance of the system. Often, applications must be restarted from scratch every time a hardware or software failure interrupts the job. The trend toward diminishing productivity in coding, debugging, optimizing and modifying applications, in over-provisioning hardware, and even in simply running high-end applications is alarming.

To fill this high-end technology and capability gap, protect critical national security missions, and ensure a new generation of economically viable systems, the United States' Defense Advanced Research Projects Agency (DARPA) has set some very demanding goals for the High Productivity Computing Systems (HPCS) program. By the end of this decade, DARPA has asked for huge leaps, such as improving real versus peak application performance by a factor of 10 to 40 and reducing the cost and time of developing solutions by a factor of 10.

Into the Future

Sun's vision of HPC aligns well with the needs of the U.S. government and the greater industrial community. Our vision includes systems scaled from thousands to tens of thousands of processors working in an efficient, simple and highly resilient manner. These systems would churn out results that help lead to new discoveries and provide competitive advantages, while requiring relatively little manpower and no exceptional programming expertise, and using open source software tools developed by the community.

Impossible? We don't think so. As a Phase II participant in DARPA's HPCS program, Sun has put together an amazing team of engineers and innovators. Led by Sun Fellow and vice-president Jim Mitchell, our "Hero" program (which got its name when Sun Fellow Ivan Sutherland commented that we are undertaking "an effort to build a system of heroic proportions") has been heads-down designing this revolutionary leap forward in productivity.

Over the last four years we have developed fully integrated system designs based on innovative new hardware and software technologies that we are confident can indeed make this leap. With an emphasis on delivering high levels of productivity to the developer, the system administrator and the facility operator, our research has led us to appreciate the value that massive bandwidth brings to the table - value that translates to increased productivity. Enabling features include globally addressable memory, along with system-level and application-level checkpointing combined with hardware and software telemetry for dramatically improved fault tolerance. Advanced features such as these make the system appear more like a flat memory system and allow the developer to focus on solving the problem at hand rather than making elaborate efforts to distribute data in a robust manner.
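To make the idea of application checkpointing concrete, here is a minimal Python sketch of a job that periodically saves its state so that a failure does not force a restart from scratch. It is an illustration only, not Sun's design: the function names (`save_checkpoint`, `load_checkpoint`, `run`), the JSON file format, and the `fail_at` parameter used to simulate a failure are all assumptions made for this example.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically: write a temp file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace is atomic, so a reader never sees a half-written checkpoint.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the last saved state, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"step": 0, "total": 0}
    with open(path) as f:
        return json.load(f)


def run(path, n_steps, checkpoint_every=100, fail_at=None):
    """Sum the integers 0..n_steps-1, resuming from the last checkpoint.

    fail_at (hypothetical, for demonstration) raises an error at that step
    to stand in for a hardware or software failure.
    """
    state = load_checkpoint(path)
    while state["step"] < n_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated failure")
        state["total"] += state["step"]  # the "work" done in this step
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

If a first call to `run` fails partway through, a second call with the same checkpoint path picks up from the most recent saved step rather than from step 0, so only the work done since the last checkpoint is repeated. A real HPC system would have to coordinate such checkpoints across many nodes, which this single-process sketch does not attempt.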

Proximity Communication

To achieve our goals, our Very Large Scale Integration (VLSI) Research Group at Sun Labs has been working on innovative technologies to radically improve the bandwidth and latencies associated with chip-to-chip communications. One technology we are looking at is capacitive coupling, which enables high-speed data communication between neighboring chips without the need for wires of any kind. This technology, which we call Proximity Communication, aligns metal plates on one chip with metal plates on a neighboring chip and transfers data between them with reduced power and with bandwidths and latencies approaching those of native on-silicon communication. The result is comparable to wafer scale integration, but is accomplished by aligning many small, tested chips. Connecting these chips using Proximity Communication not only reduces system latency but also improves cross-section bandwidth and reduces communication power. A critical part of Sun's work in this area has been the design of system architectures that can best capitalize on this technology.
 
The ability to connect chips in this fashion is only one part of the bandwidth solution, however. In order to break out of the physical limitations of the X/Y dimension into another plane, we have added breakthrough optical communications technology to the mix.

Silicon Photonics
