2015 HPCwire Awards – Readers’ & Editors’ Choice


Every year, HPCwire conducts our Readers’ Choice Awards to recognize the best and brightest developments in HPC over the last 12 months. These awards, which are nominated and voted on by the global HPC community, are announced and presented during the annual Supercomputing Conference, held this year in Austin, Texas.

These annual awards are highly coveted as prestigious recognition of achievement by the HPC community, and are given in the form of both “Readers’ Choice” and “Editors’ Choice” awards. We’d like to extend our congratulations to this year’s winners.


  • Best Use of HPC Application in Life Sciences
  • Best Use of HPC Application in Manufacturing
  • Best Use of HPC Application in the Oil and Gas Industry
  • Best Use of HPC in Automotive
  • Best Use of HPC in Financial Services
  • Best Use of HPC in Entertainment
  • Best Use of HPC in the Cloud
  • Best Use of High Performance Data Analytics
  • Best Application of 'Green Computing' in HPC
  • Best HPC Server Product or Technology
  • Best HPC Storage Product or Technology
  • Best HPC Software Product or Technology
  • Best HPC Visualization Product or Technology
  • Best HPC Interconnect Product or Technology
  • Best HPC Cluster Solution or Technology
  • Best Data-Intensive System (End User focused)
  • Best HPC Collaboration Between Government & Industry
  • Best HPC Collaboration Between Academia & Industry
  • Top Supercomputing Achievement
  • Top 5 New Products or Technologies to Watch
  • Top 5 Vendors to Watch
  • Workforce Diversity Leadership Award
  • Outstanding Leadership in HPC

Readers’ Choice

Tom Tabor with Kathy Yelick (LBNL), Evangelos Georganas (UC Berkeley), Lenny Oliker (LBNL), & Rob Egan (Joint Genome Institute)

Tom Tabor & Peter Ungaro

Lawrence Berkeley National Lab’s Joint Genome Institute and UC Berkeley for boosting the assembly of the human genome on the Cray XC30 “Edison” supercomputer

A team from the Joint Genome Institute at Lawrence Berkeley National Lab and researchers from UC Berkeley have used 15,000 cores on the Cray XC30 “Edison” supercomputer to boost the complete assembly of the human genome, bringing the time down to 8.4 minutes. This is an impressive application of modern parallel algorithm design and programming — tools that will be critical to the optimal use of HPC systems to solve large-scale problems that benefit society.
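The assembly pipeline the team parallelized begins with k-mer analysis of the raw sequencing reads. As a rough illustration of that first stage (a single-process sketch, not the team’s actual distributed code; the function name and toy inputs are ours):

```python
from collections import Counter

def kmer_histogram(reads, k):
    """Count all length-k substrings (k-mers) across a set of reads.

    k-mer counting is the opening stage of de Bruijn-graph genome assembly;
    in the Edison run this stage was distributed over thousands of cores,
    whereas this sketch runs in a single process.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

# Two overlapping toy "reads"; real inputs are billions of short sequences.
print(kmer_histogram(["ACGTAC", "CGTACG"], k=3).most_common(2))
```

At scale, the histogram (and the graph traversal that follows it) is what gets partitioned across nodes, which is where the parallel algorithm design praised above comes in.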


Editors’ Choice

Tom Tabor with Michael Levine, Ralph Roskies, Leila Haidari, Shawn Brown & Eli Zenkov of PSC

The HERMES Logistics Modeling Team (Johns Hopkins, Pittsburgh Supercomputing Center, and International Vaccine Access Center) for using computational modeling and simulation software to help the Republic of Benin in West Africa determine how to bring more lifesaving vaccines to its children

The HERMES Logistics Modeling Team, comprising researchers from the Pittsburgh Supercomputing Center (PSC), the Johns Hopkins Bloomberg School of Public Health, and the International Vaccine Access Center (IVAC), has used HERMES, its public health supply chain modeling software, to help the Republic of Benin in West Africa determine how to bring more lifesaving vaccines to its children. The team reported its findings in the journal “Vaccine.” This marks a seminal achievement in HPC, as the computational modeling directly led the country to redesign its immunization supply chain to lower costs and avert illness and death from vaccine-preventable diseases.

Readers’ Choice

Tom Tabor presenting awards to Greg Estes and the NVIDIA team

Dassault Systèmes accelerates SIMULIA Abaqus simulations using NVIDIA GPUs

GPU support for solvers widely used in manufacturing for noise and vibration analysis offers a performance boost for natural frequency extraction in models that have 5,000 or more modes. For example, in the case of a full vehicle model with 20M degrees of freedom and over 10,000 modes, the simulations run 50 percent faster using NVIDIA GPUs. Frequency response simulation is used to better understand the operation of a system or a vehicle under continuous harmonic loading, and here the solver shows ground-breaking performance with NVIDIA GPUs. For a car model with ~10M DOF and ~2,000 modes, GPUs help the frequency response solver run three times faster than a 16-core CPU-only system.


Editors’ Choice

(TIE) PING Golf’s use of HPC software, from CAE solvers for optimizing design to HPC portals for managing compute-intensive job submission to remote visualization for collaboration, with the latter trimming a day from the design cycle

A leader in golf club manufacturing, PING Golf is known for its innovative designs and design approach. The company uses a full suite of HPC software, from CAE solvers for optimizing design to HPC portals for managing compute-intensive job submission to remote visualization for collaboration on design. The company reports that its use of remote visualization is “[its] most significant functionality update since the move to HPC itself” and that it has doubled its efficiency. “In most cases we are saving a full day of time or more,” the company adds. “We get instant feedback, make the change and have enough time to view the results of those changes the same day.”

Left to Right: Anil Goli (GM), Appalaswamy Akasapu (GM), Vipin Patney (GM), Alex Bouzari (DDN), Whitney Wickesberg (GM), Paul Bloch (DDN) and Tom Tabor

(TIE) General Motors for improving car safety and streamlining design development with faster and more accurate crash test simulations using DDN storage and high-performance computers for analysis

General Motors is ushering in a new era of car safety and streamlining design development with faster and more accurate crash test simulations, using DDN storage and high-performance computers for analysis. The HPC infrastructure is leveraged strategically by hundreds of GM engineers working across multiple test, backup and production applications, and by GM’s consolidated Computer-Aided Engineering (CAE) systems, to accelerate new vehicle design and delivery.

Readers’ Choice

Tom Tabor presenting awards to Greg Estes and the NVIDIA team

Behavioral Recognition Systems Labs for leveraging NVIDIA GPUs and CUDA to extend its natural language processing system to up/mid/downstream domains, enabling automatic identification of anomalies

Behavioral Recognition Systems Labs (BRS Labs) leverages NVIDIA GPUs and CUDA to extend its natural language processing system to up/mid/downstream domains for major oil companies. The resulting detailed data includes anomaly and event detection without human intervention, which increases efficiency and reduces risk. Intelligent solutions for increased safety, security and operational efficiency within oil and gas operations are possible for BRS Labs thanks to NVIDIA’s deep learning capabilities. The end result for energy companies is an overall savings of resources and fewer potential reputation issues from accidents.


Editors’ Choice

Tom Tabor & Charlie Wuischpard

Tom Tabor & Peter Ungaro

Tom Tabor with Devin Jensen, Bill Nitzberg and Mike Kidder of Altair

Intel, Cray and Altair for their collaboration on a subsea riser simulation solution used to perform advanced subsea CFD

Intel, Cray and Altair collaborated to benchmark a subsea riser simulation solution that gives engineers the computational systems they need to perform advanced subsea computational fluid dynamics (CFD) analysis with better speed, scalability and accuracy. With Altair’s AcuSolve CFD solver running on Cray XC supercomputer systems powered by Intel Xeon processors, operators and engineers responsible for riser system design and analysis can increase component life, reduce uncertainty and improve the overall safety of ultra-deep-water systems while still meeting their demanding development schedules. The study results demonstrate an ability to speed up simulations significantly, even at 4000+ cores; achieve a 20x L/D ratio increase on a single Cray XC cabinet; and design longer-lasting riser systems with better performance and integrity.

Readers’ Choice

Tom Tabor presenting awards to Greg Estes and the NVIDIA team

(TIE) NVIDIA DRIVE PX, the GPU-powered auto-pilot car computer, provides new capabilities for self-driving cars

NVIDIA DRIVE PX is a powerful new auto-pilot car computer that is enabling self-driving cars. Leveraging neural network models trained on HPC servers powered by NVIDIA GPUs, DRIVE PX gathers environmental data from a car’s on-board high-resolution cameras and sensors – road signs, vehicle types, pedestrians, parking lanes, and more – analyzes it instantaneously, and provides navigational directions to the car. For example, DRIVE PX can search for an open spot in a crowded garage and slip into it with an expert series of turns. Then, with the press of a button, it guides the car out of the space and back to the driver. German auto manufacturer Audi is evaluating DRIVE PX for future models.

Tom Tabor with Alex Bouzari and Paul Bloch

Tom Tabor with Bastian Koller (HLRS), Michael Resch (HLRS) and Paul Bloch (DDN)

(TIE) HLRS’s Automotive Simulation Center (ASCS) for running over 1,000 crash simulations within 24 hours, leveraging DDN storage

HLRS’s Automotive Simulation Center (ASCS) is demonstrating the commercial benefits of data-intensive computing for industrial manufacturing and commercial big data computing. Running more than 1,000 crash simulations within 24 hours helped deliver high-performance designs faster than previously thought possible. The Simulation Center leveraged DDN storage to overcome the I/O bottleneck challenges that were previously limiting its ability to increase simulations in computational fluid dynamics (CFD) workflows.


Editors’ Choice

Ford uses HPC to reduce F-150 weight by over 400 lbs

Ford used an integrated suite of CAE design and HPC optimization tools to minimize the weight of the 2015 F-150, cutting more than 400 pounds from the vehicle’s body. Ford engineers took a holistic approach to weight reduction by incorporating advanced materials into the entire vehicle design, including frame, body, powertrain, battery and interior features such as seats. The weight savings help the truck tow more, haul more, accelerate quicker, stop shorter and improve fuel efficiency. For these accomplishments Ford won the 2015 Enlighten Award, which recognizes innovation in vehicle weight reduction. Ford also uses commercial HPC workload management to optimize the utilization and performance of its massive HPC systems.

Readers’ Choice

Tom Tabor & Ken Claffey

Seagate developed a hyper-scale architecture that supports Cleversafe, Scality and SwiftStack, creating a software-defined object tier to facilitate data protection, archive, and collaboration for critical financial services applications

Seagate has been applying its experience working with cloud service providers worldwide to help meet the performance and scale needs of next-generation workloads. Seagate Cloud Systems and Solutions developed a hyper-scale architecture that supports Cleversafe, Scality, and SwiftStack, creating a software-defined object tier to facilitate data protection, archive, and collaboration for critical financial services applications.


Editors’ Choice

Tom Tabor with Karuna Chelmella, Alex Bouzari, Andrei Vakhnin, John Eubanks, Paul Bloch and Michael Chazot of Fannie Mae and DDN

DDN storage helps Fannie Mae integrate data silos across the organization

Fannie Mae, the organization established to provide financial products and services that increase the availability and affordability of housing for low-, moderate- and middle-income Americans, integrated data silos across the organization while achieving a 452 percent acceleration of its most complex SAS Grid workflows, in part using DDN storage technology.

Readers’ Choice and Editors’ Choice

Tom Tabor presenting awards to Greg Estes and the NVIDIA team

NVIDIA GPU technology for the seventh consecutive year was used in every film nominated by the Academy of Motion Picture Arts and Sciences for the Oscar for Best Visual Effects, including the 2015 winner, Interstellar

For the seventh consecutive year, every film nominated by the Academy of Motion Picture Arts and Sciences for the Oscar for Best Visual Effects was powered by NVIDIA GPU technology, including Interstellar, which took home the 2015 award. Double Negative, Industrial Light & Magic (ILM), Framestore and WETA Digital are among the visual effects studios whose artists have been honored. All of them and others rely on NVIDIA GPUs to achieve groundbreaking visual effects on artist workstations and in render pipelines. By allowing visual effects artists to work faster, NVIDIA GPUs eliminate the lag time in previewing renders to determine whether a shot meets the director’s vision.

Readers’ Choice and Editors’ Choice

Tom Tabor with Guy Lonsdale and Mark Parsons

Fortissimo – the collaborative project that connects European SMEs with simulation services running on a high performance computing cloud infrastructure – has 53 case studies underway

Fortissimo is a collaborative project that enables European SMEs to be more competitive globally through the use of simulation services running on a high performance computing cloud infrastructure. The project is funded by the European Commission within the 7th Framework Programme and coordinated by the University of Edinburgh. It involves nearly 100 partners, among them manufacturing companies, application developers, domain experts, IT solution providers and HPC cloud service providers from 14 countries. These partners are engaged in 53 experiments (case studies) where business relevant simulations of industrial processes are implemented and evaluated. The ultimate objective is to provide a “one-stop-shop” that will greatly simplify access to advanced simulation.

Readers’ Choice

Tom Tabor with Michael Levine, Ralph Roskies, Philip Blood of PSC and Jeffrey Raymond and Kim Wong of University of Pittsburgh

(TIE) The Pittsburgh Genome Resource Repository of large de-identified national datasets, including The Cancer Genome Atlas (1.1PB), provides a portal that fosters HPC tool use to advance cancer research and personalized medicine

The Pittsburgh Genome Resource Repository (PGRR) is a leading-edge information technology resource for storing, accessing and analyzing large de-identified national datasets, including The Cancer Genome Atlas (TCGA) from NIH, all of which are important for personalized medicine. The PGRR provides a portal that allows researchers to use these data easily with tools and HPC resources. As a managed environment, the PGRR helps researchers meet information security and regulatory requirements, provides a single consistent view of all datasets, and helps users stay current on updates and modifications made to these datasets. It facilitates the large-scale profiling of TCGA’s 1.1 PB of data, comprising data on tumor samples from 11,000 cancer patients, to better understand genetic pathways and eventually enable personalized cancer treatments.

Mark Wilkinson (DiRAC) with Tom Tabor, and Juha Jaykka, James Briggs and Paul Shellard Skyping in from the UK

(TIE) Stephen Hawking Centre for Theoretical Cosmology, Cambridge University, DiRAC STFC uses the first Intel Xeon Phi-enabled SGI UV2000 with its co-designed ‘MG Blade’ Phi-housing to achieve 100X speed-up of MODAL code to probe the Cosmic Background Radiation

The COSMOS supercomputing facility at the Stephen Hawking Centre for Theoretical Cosmology, Cambridge University, has enjoyed a long industrial collaboration with SGI and Intel. COSMOS hosts the world’s first Intel Xeon Phi-enabled SGI UV2000 with its co-designed ‘MG Blade’ Phi-housing. This collaboration led to modernizing the cosmological code ‘WALLS’ for the Phi, achieving impressive speed-ups. A second project, ‘MODAL’, a Planck satellite analysis pipeline, achieved a 100X speed-up, described in a chapter of “High Performance Parallel Pearls 2” and in a specialized research paper. The accelerated pipeline crosses qualitative thresholds in capability, opening up insights into potential new physics in the early universe.


Editors’ Choice

Tom Tabor with Dominic Borkowski, Bill Marmagas, Mandy Wilson, Keith Bisset, Kevin Shinpaugh, Chris Kuhlman of VBI

Virginia Bioinformatics Institute’s (VBI) data analytics platform can simulate the entire US population in seconds versus an hour, helping to contain influenza outbreaks and optimizing placement of treatment centers for the Ebola outbreak

In the middle of an infectious disease outbreak, decision makers can’t wait until the outbreak is over to look at history and trends. The Virginia Bioinformatics Institute’s (VBI) leading-edge data analytics leverage HPC technology to simulate biological systems faster. Leveraging HPC technology from DDN, VBI can now simulate the entire US population in just seconds; back in 2005, it took an hour to simulate just the population of Chicago. This massive acceleration in analytics capability has been used to contain influenza outbreaks, and most recently the U.S. Department of Defense called on VBI to help place treatment centers during the Ebola outbreak. VBI was able to rapidly predict outbreak patterns for the highly contagious disease, enabling the government and healthcare providers to respond quickly and save lives. VBI has created an ever-expanding synthetic global population for computational models that predict the spread of outbreaks and the efficacy of potential interventions.

Readers’ Choice

Tom Tabor with Dieter Kranzlmueller and Carla Beatriz Guillen Carias of Leibniz Supercomputing Centre

Leibniz Supercomputing Centre’s use of warm-water cooling to make chiller-free cooling possible, and its use of deionized water to minimize the effects of leaks

At Leibniz Supercomputing Centre (LRZ), green IT has been an important topic for many years and has evolved into one of LRZ’s central research fields. In addition to the compute, network and storage components, the energy efficiency of the cooling and air conditioning systems contributes significantly to the overall energy consumption of a datacenter. SuperMUC, which entered operation in 2012, continues to be one of the most energy-efficient supercomputers in the world. The proven hot-water cooling technology, implemented by IBM, was also applied to Phase 2 of the installation, which came online in June 2015. Through a network of micro channels, the cooling system circulates water at 45 degrees Celsius over active system components, such as processors and memory, to dissipate heat, so no additional chillers are needed. The latest processors, which can adapt their frequency to the specific needs of a computation, add to these power-saving efforts. Combined with energy-optimizing operating software, these measures reduce overall system power usage by approximately 40 percent, according to the center.


Editors’ Choice

Tom Tabor and Motoaki Saito

Japanese manufacturing partners PEZY Computing and ExaScaler Inc. for claiming the top three spots on the June 2015 Green500 list

The top three systems on the June 2015 Green500 list were the result of a collaborative effort between fabless Japanese startup PEZY Computing and immersion cooling company ExaScaler. All three systems have the distinction of being the first 6+ gigaflops-per-watt entries on the TOP500/Green500 lists. The Shoubu supercomputer at RIKEN, the new record-holder, went even further, achieving 7.03 gigaflops-per-watt. Shoubu was followed closely by two machines from the High Energy Accelerator Research Organization (KEK): Suiren Blue, which took second place with 6.84 gigaflops-per-watt, and Suiren, which claimed third place with 6.22 gigaflops-per-watt.

All were built using PEZY’s second-generation 1,024-core custom MIMD processor and ExaScaler’s immersion liquid cooling technology. The lead machine, Shoubu, employed second-generation ExaScaler technology along with Intel’s Xeon E5-2618L v3 processor (8 cores / 16 threads, 2.3GHz ~ 3.4GHz), equipped with 64GB of memory and InfiniBand FDR. The “PEZY-SC” accelerator processor is said to offer 3 teraflops single-precision and 1.5 teraflops double-precision performance.

Readers’ Choice

Tom Tabor presenting awards to Greg Estes and the NVIDIA team

(TIE) NVIDIA Tesla K80 GPU accelerator

NVIDIA’s Tesla K80 accelerator card is based on the “Kepler” family of GPUs and powered by the CUDA parallel computing model. Launched at SC14, the twin GPU Tesla card was designed with the most difficult computational challenges in mind, ranging from astrophysics, genomics and quantum chemistry to data analytics. The K80 is also optimized for advanced deep learning tasks, one of the fastest growing segments of the machine learning field. Based on a pair of GK210 chips, each accelerator card provides 4,992 CUDA cores, 24 GB of memory, and 480 GB/sec of memory bandwidth. With its GPU Boost overclocking mechanism, the Tesla K80 delivers up to 8.74 teraflops single-precision and up to 2.91 teraflops double-precision peak floating point performance, and, according to NVIDIA, offers up to 10 times higher performance than today’s fastest CPUs on leading science and engineering applications, such as AMBER, GROMACS, Quantum Espresso and LSMS.
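The quoted peak figures can be sanity-checked from the card’s core count. The 875 MHz boost clock and the 1:3 double-to-single precision ratio below are our assumptions, not stated in the article:

```python
# Published K80 figures from the article:
cuda_cores = 4992             # total across the two GK210 chips
flops_per_core_per_cycle = 2  # one fused multiply-add counts as 2 operations

# Assumed (not in the article): maximum GPU Boost clock of the K80.
boost_clock_ghz = 0.875

peak_sp_tflops = cuda_cores * flops_per_core_per_cycle * boost_clock_ghz / 1000.0
# Assumed (not in the article): GK210 runs double precision at 1/3 the SP rate.
peak_dp_tflops = peak_sp_tflops / 3.0

print(round(peak_sp_tflops, 2), round(peak_dp_tflops, 2))  # 8.74 2.91
```

Under those assumptions the arithmetic reproduces the 8.74/2.91 teraflops figures quoted above.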

Tom Tabor & Peter Ungaro

Tom Tabor & Charlie Wuischpard

(TIE) Cray XC40 supercomputer with collaboration from Intel

Cray and Intel are working together to create innovative supercomputer solutions that deliver performance, scalability and reliability. Leveraging the Intel Xeon processor roadmap, the Aries high-performance interconnect and the flexible Dragonfly network topology, the Cray XC40 supercomputer provides low latency and scalable global bandwidth to satisfy challenging multi-petaflops applications. The XC40 series architecture implements two processor engines per compute node with four compute nodes per blade. Compute blades stack in eight pairs (16 to a chassis), and each cabinet can be populated with up to three chassis.


Editors’ Choice

Tom Tabor & Peter Ungaro

Tom Tabor & Charlie Wuischpard

Cray XC40 supercomputer with collaboration from Intel

Cray and Intel are working together to create innovative supercomputer solutions that deliver performance, scalability and reliability. Leveraging the Intel Xeon processor roadmap, the Aries high-performance interconnect and the flexible Dragonfly network topology, the Cray XC40 supercomputer provides low latency and scalable global bandwidth to satisfy challenging multi-petaflops applications. The XC40 series architecture implements two processor engines per compute node with four compute nodes per blade. Compute blades stack in eight pairs (16 to a chassis), and each cabinet can be populated with up to three chassis.

Readers’ Choice

Tom Tabor with Alex Bouzari and Paul Bloch

(TIE) DataDirect Networks’ EXAScaler 7000 (ES7K)

The DDN EXAScaler 7000 (ES7K) comes with Lustre 2.5 pre-installed and fully optimized for DDN’s high-density storage appliances. ES7K delivers enterprise-level features, including simple management and monitoring interfaces, and HPC-level performance that optimizes both mixed and parallel I/O. Built specifically for organizations with data-intensive, high-performance storage requirements, the ES7K provides a way to leverage the benefits of DDN’s high-performance storage architecture with open-source Lustre file systems.

Tom Tabor & Ken Claffey

(TIE) Seagate ClusterStor 1500

Seagate continues to be a champion of delivering an enterprise-like experience for both newcomers and established players in HPC by leveraging its core storage technologies to provide differentiated, scalable storage solutions. With ClusterStor 1500, Seagate has built a robust, high-throughput storage appliance for HPC. Features of ClusterStor 1500 include department-leading performance of up to 110GB/s, raw capacity of up to 7.3 PB, scale-out storage building blocks, the Lustre parallel filesystem and a comprehensive management platform. The solution was purpose-built to satisfy data-intensive, department-level compute cluster needs, and designed to provide best-in-class scale-out storage for mid-tier high performance computing environments.


Editors’ Choice

Tom Tabor with Alex Bouzari and Paul Bloch

DataDirect Networks’ EXAScaler 7000 (ES7K)

The DDN EXAScaler 7000 (ES7K) comes with Lustre 2.5 pre-installed and fully optimized for DDN’s high-density storage appliances. ES7K delivers enterprise-level features, including simple management and monitoring interfaces, and HPC-level performance that optimizes both mixed and parallel I/O. Built specifically for organizations with data-intensive, high-performance storage requirements, the ES7K provides a way to leverage the benefits of DDN’s high-performance storage architecture with open-source Lustre file systems.

Readers’ Choice

Tom Tabor presenting awards to Greg Estes and the NVIDIA team

NVIDIA CUDA development platform

CUDA, which stands for Compute Unified Device Architecture, is a parallel computing platform and application programming interface (API) model created by NVIDIA. It allows software developers to use a CUDA-enabled graphics processing unit (GPU) for general purpose processing – an approach known as GPGPU. The CUDA platform enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU). The CUDA Toolkit includes a compiler, math libraries and tools for debugging and optimizing the performance of applications, as well as code samples, programming guides, user manuals, API references and other documentation. With millions of CUDA-enabled GPUs sold to date, software developers, scientists and researchers are finding broad ranging uses for GPU computing with CUDA.
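The heart of the CUDA model is writing a scalar function (a “kernel”) that one lightweight thread executes per data element, identified by an index. A plain-Python sketch of that idea (illustrative only; this is not CUDA syntax, and the function names are ours):

```python
def saxpy_kernel(i, a, x, y, out):
    # Kernel body: "thread" i computes one output element, playing the role
    # of a CUDA thread whose global index comes from blockIdx/threadIdx.
    out[i] = a * x[i] + y[i]

def launch(kernel, n, *args):
    # Stand-in for a CUDA kernel launch: run the kernel once per index.
    # On a GPU these iterations execute concurrently across thousands of cores.
    for i in range(n):
        kernel(i, *args)

x, y = [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]
out = [0.0] * 3
launch(saxpy_kernel, 3, 2.0, x, y, out)
print(out)  # [12.0, 24.0, 36.0]
```

The performance gains described above come from the GPU running thousands of such per-element threads at once, rather than looping sequentially as this sketch does.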


Editors’ Choice

Tom Tabor presents OpenACC’s award to Jay Gould (Cray) and Michael Wolfe (PGI)

OpenACC

OpenACC is a parallel programming model for heterogeneous CPU/GPU systems, developed by Cray, CAPS, NVIDIA and PGI. The OpenACC Application Program Interface describes a collection of compiler directives that specify loops and regions of code in standard C, C++ and Fortran to be offloaded from a host CPU to an attached accelerator. OpenACC is designed for portability across operating systems, host CPUs, and a wide range of accelerators, including APUs, GPUs, and manycore coprocessors. The directives and programming model defined in the OpenACC API document allow programmers to create high-level host+accelerator programs without the need to explicitly initialize the accelerator, manage data or program transfers between the host and accelerator, or initiate accelerator startup and shutdown.

Readers’ Choice

Tom Tabor with Andrea Rodolico, Paolo Borelli, Nicola Venuti and Dave Moore of NICE Software

(TIE) NICE Desktop Cloud Visualization (DCV)

NICE DCV enables multiple simultaneous users to remotely access 2D/3D interactive applications and share available GPUs over standard LAN or WAN networks. NICE has released a new version of DCV that sets a new performance standard for 3D applications in the cloud. DCV uses H.264 compression and leverages the Kepler graphics architecture of NVIDIA GRID technology to significantly accelerate image encoding. DCV’s GPU-accelerated compression allows high frame rates, improved accuracy and a reduction in the overall bandwidth usage of the visualization stream, to the extent that 3D visualization is now possible over low-bandwidth links. DCV’s value is recognized by flagship vendors such as AWS, ANSYS and SGI, who use it as a key component in their offerings.

Tom Tabor with Berk Geveci

(TIE) ParaView

ParaView is an open-source, multi-platform data analysis and visualization application that has become an integral tool at many national laboratories, universities and companies. ParaView users can quickly build visualizations to analyze their data using qualitative and quantitative techniques. Data exploration can be done interactively in 3D or programmatically using ParaView’s batch processing capabilities. ParaView was developed to analyze extremely large datasets using distributed-memory computing resources: it can run on supercomputers to analyze petascale datasets as well as on laptops for smaller data.


Editors’ Choice

Tom Tabor with Gabriel Broner and Jorge Titinger of SGI

Tom Tabor with Andrea Rodolico, Paolo Borelli, Nicola Venuti and Dave Moore of NICE Software

SGI VizServer system with NICE DCV software

SGI VizServer system with NICE DCV software is a fully integrated hardware, software and services solution from SGI that delivers efficient and optimized remote access to graphics-intensive, off-the-shelf 3D applications running on both Windows and Linux desktop environments, including major CAD, CAE, petro-technical, medical and scientific visualization software. NICE DCV is the first software product on the market to allow sharing of a single physical GPU between multiple Windows and Linux sessions while maintaining full OpenGL application acceleration and workstation-class performance. This makes the SGI VizServer solution an excellent choice for users’ remote working and collaboration needs. SGI VizServer with NICE DCV is integrated into NICE EnginFrame Views to provide 2D/3D session management via a Web browser, including the ability to share an interactive session with others for collaborative working. When coupled with EnginFrame HPC functionalities, engineers and researchers benefit from a user-friendly, Web-based experience across their complete workflow, including state-of-the-art data and batch job management using their job scheduler of choice.

Readers’ Choice and Editors’ Choice

Tom Tabor with Michael Kagan, Eyal Waldman and Gilad Shainer

Mellanox 100Gb/s EDR InfiniBand Technology

Switch-IB is the seventh-generation switching IC from Mellanox, delivering 36 ports with 100Gb/s of throughput per port, making it the world’s highest-capacity switch. With 144 integrated SerDes (Serializer/Deserializer) lanes, each able to operate at 1 Gb/s to 25 Gb/s, it delivers 7.2 Tb/s of switching capacity and 5.4 billion packets per second, enabling high-performance computing, cloud, Web 2.0, database and storage centers to deliver high application performance. ConnectX-4 is the highest-throughput VPI adapter, supporting EDR 100Gb/s InfiniBand and 100Gb/s Ethernet, and delivering high bandwidth, low latency and high computation efficiency for high-performance computing clusters.

Readers’ Choice

Tom Tabor & Charlie Wuischpard

Intel MPI Library

Intel MPI Library 5.1 focuses on making applications perform better on Intel architecture-based clusters, implementing the high-performance Message Passing Interface 3.0 specification on multiple fabrics. It enables developers to quickly deliver maximum end-user performance, even if they change or upgrade to new interconnects, without requiring changes to the software or operating environment. This high-performance MPI library is used to develop applications that can run on multiple cluster interconnects chosen by the user at runtime. An optimized shared-memory path for multicore platforms allows higher communication throughput and lower latencies, and the native InfiniBand interface (OFED verbs) also supports lower latencies.
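The message-passing pattern MPI implements, explicit sends and receives between otherwise independent ranks, can be sketched with Python threads and queues standing in for ranks and the fabric (illustrative only; this is not the Intel MPI API, and the function names are ours):

```python
import queue
import threading

def worker(inbox, outbox):
    # "Rank 1": block on a receive, do some local work, send the result back.
    data = inbox.get()
    outbox.put(sum(data))

def send_and_reduce(data):
    # "Rank 0": post a message to the other rank and wait for its reply,
    # mirroring a blocking MPI_Send / MPI_Recv exchange.
    inbox, outbox = queue.Queue(), queue.Queue()
    t = threading.Thread(target=worker, args=(inbox, outbox))
    t.start()
    inbox.put(data)
    result = outbox.get()
    t.join()
    return result

print(send_and_reduce([1, 2, 3, 4]))  # 10
```

In a real MPI program the ranks are separate processes, possibly on different nodes, and the library routes each message over whichever fabric was selected at runtime.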


Editor’s Choice

Tom Tabor with Matthijs van Leeuwen and Martijn de Vries

Bright Cluster Manager for HPC

Bright Cluster Manager for HPC enables customers to deploy complete HPC clusters over bare metal or in the cloud and manage them effectively. It provides single-pane-of-glass management for the hardware, the operating system, HPC software, and users. With Bright Cluster Manager for HPC, system administrators can get their clusters up and running quickly and keep them running reliably throughout their life cycle – all with the ease and elegance of a fully featured, enterprise-grade cluster manager. The latest version extends coverage from bare metal to the top of the application stack, enhances workload management capabilities, and adds support for a rich collection of HPC software and an updated toolchain for HPC developers.

Reader’s Choice

Tom Tabor with Weijia Xu, Niall Gaffney and Christopher Jordan of TACC

Texas Advanced Computing Center (TACC) Wrangler supercomputer

Wrangler is the most powerful data analysis system allocated in XSEDE. The system is designed for large-scale data transfer, analytics, and sharing, and provides flexible support for a wide range of software stacks and workflows. Its scalable design allows for growth in the number of users and data applications. Dell and EMC (since merged) are the two strategic partners providing the technology that makes up the core of Wrangler. Wrangler uses EMC’s DSSD rack-scale flash technology to ensure speed and performance, enabling real-time analytics at scale. Wrangler will help drive next-generation programming models for exascale-class HPC applications; it is portable and avoids dependencies on proprietary communications stacks.


Editor’s Choice

Tom Tabor and Ian Foster

(TIE) University of Chicago Globus Service for moving and sharing data

Globus has become a preferred service for moving and sharing research data on a wide variety of HPC and campus computing resources. With the recent release of data publication and discovery capabilities, Globus now provides useful tools for managing data at every stage of the research lifecycle. Usage recently passed the 100 petabyte mark, with the service being used at over 200 R&E institutions in the US and abroad. Globus manages file transfer, monitoring performance, retrying failures, auto-tuning and recovering from faults automatically where possible, and reporting status. There’s no custom infrastructure or software to install.

SDSC Director Michael Norman (third from right), joins Comet launch team members in accepting HPCwire’s Editors’ Choice award for the Center’s ‘Comet’ supercomputer from HPCwire publisher Tom Tabor (far right) at the Supercomputing 2015 (SC15) conference in Austin, TX.

(TIE) San Diego Supercomputer Center (SDSC) Comet supercomputer, a science gateway serving thousands through simple, domain-specific Web interfaces.

Comet, based at the San Diego Supercomputer Center, is designed to meet the needs of what is often referred to as the “long tail” of science – the idea that the large number of modest-sized, computationally based research projects represents, in aggregate, a tremendous amount of research that can yield scientific advances and discovery. Comet is a dedicated XSEDE cluster designed by Dell and SDSC, delivering ~2.0 petaflops. It features next-generation Intel processors with AVX2, Mellanox FDR InfiniBand interconnects, and Aeon storage. The standard compute nodes consist of Intel Xeon E5-2680v3 processors, 128 GB of DDR4 DRAM (64 GB per socket), and 320 GB of local SSD scratch storage. The GPU nodes contain four NVIDIA GPUs each. The large-memory nodes contain 1.5 TB of DRAM and four Haswell processors each. The network topology is 56 Gbps FDR InfiniBand with rack-level full bisection bandwidth and 4:1 oversubscribed cross-rack bandwidth. Comet has 7 petabytes of 200 GB/s performance storage and 6 petabytes of 100 GB/s durable storage, plus dedicated gateway/portal hosting nodes and a virtual image repository. External connectivity to Internet2 and ESnet is 100 Gbps.

Reader’s Choice

Tom Tabor & Charlie Wuischpard

Tom Tabor with Jason Stowe (Cycle Computing), Hoot Thompson (NASA), Garrison Vaughan (IT Coalition, Inc.), Daniel Duffy (NASA), Tim Carroll (Cycle Computing) and Jamie Baker (AWS)

NASA study of climate change in the Sahara using HPC working with AWS, Cycle Computing, and Intel

As part of a Head in the Clouds program with AWS, Cycle Computing, and Intel, NASA is leveraging cloud computing to study the carbon stored in trees and bushes across the Sahara. An initial test run on 8 TB of data used about 200 virtual machines running non-stop to complete 124,800 total compute hours of simulations in about one month. The immediate scalability of the cloud showed that the full 80 TB dataset (10x the initial test run) could be processed in the same amount of time as the initial 8 TB. The program demonstrates how cloud computing can accelerate time to results.


Editor’s Choice

DOE and IBM, NVIDIA and Mellanox collaboration to build Summit/Sierra supercomputers

The U.S. Department of Energy is building two GPU-accelerated supercomputers – expected to deliver at least three times the performance of today’s most powerful system – moving closer to exascale computing levels. Summit at Oak Ridge National Laboratory will deliver 150 to 300 peak petaflops; Sierra at Lawrence Livermore National Laboratory will perform well in excess of 100 peak petaflops. These systems will be based on next-generation IBM POWER servers with NVIDIA Tesla GPU accelerators and NVIDIA NVLink high-speed GPU interconnect technology, and will be considerably faster than the current top U.S. system, Oak Ridge’s Titan (27 peak petaflops).

Tom Tabor presenting awards to Greg Estes and the NVIDIA team

Tom Tabor and Ken King

Tom Tabor with Michael Kagan, Eyal Waldman and Gilad Shainer

Tom Tabor presents the Editors' Choice Award to Buddy Bland of ORNL

Reader’s Choice

Tom Tabor with Orion Pineda (BSC), Alison Kennedy (EPCC), Florian Berberisch (Forschungszentrum Juelich GMBH) and Stephane Requena (GENCI)

PRACE’s HPC access program for industrial and scientific projects which so far has awarded more than 50 industrial projects for a total of almost 400 million core hours

PRACE opened its HPC access to industrial projects three years ago. Industrial and scientific projects alike are awarded based on their scientific excellence, and access to PRACE resources is free of charge, on the condition that the industrial user publishes all results at the end of the project. So far, PRACE has awarded more than 50 industrial projects, for a total of almost 400 million core hours, to close to 40 different companies in domains including automotive, aeronautics, biology, pharmaceutics, finance/insurance, energy, materials, and digital media. Beyond access to HPC resources, industrial users can also benefit from training (through the six PRACE Training Centers) and a tailored HPC evangelization program for SMEs (the SHAPE program, which has attracted 21 SMEs).


Editor’s Choice

Tom Tabor with Ahmed Taha and Merle Giles of NCSA

NCSA‘s Private Sector Program

NCSA’s Private Sector Program (PSP) strives to engage industry and expand access to, awareness of, and training for HPC resources. NCSA has worked with more than one-third of the Fortune 50, in sectors including manufacturing, oil and gas, finance, retail/wholesale, bio/medical, life sciences, astronomy, agriculture, technology, and more. The program currently counts 26 partners. PSP’s core mission is to help its partner community gain a competitive edge through expert use of modern, high-performance digital and human resources. Traditional projects are now complemented by dedicated, non-government high-performance computing resources, including the iForge cluster; a high-tech, mobile consulting team; software/hardware benchmarking and development in production environments; code-performance teams; public-private partner leadership; and blended partner applied research and development.

Reader’s Choice

Tom Tabor presenting awards to Greg Estes and the NVIDIA team

NVIDIA GPUs enable 20 of the top 25 systems on the June 2015 Green500 list.

Restraining power consumption as compute power grows is one of the thorniest challenges in supercomputing today. Environmental impact, cost, and the sheer practicality of generating enough power are all compelling issues. NVIDIA’s GPU leadership has been clearly demonstrated by its record of enabling 20 of the top 25 systems – 80 percent – on the June 2015 Green500 list. Indeed, heterogeneous accelerator-based systems dominate the top places of the Green500, and NVIDIA has been preeminent in delivering both computing performance and lower power consumption. In the November 2014 edition of the list, the top 23 supercomputers used accelerators; in the most recent edition (June 2015), the top 32 did, a nearly 40 percent increase.


Editor’s Choice

Tom Tabor and Ryutaro Himeno

Tom Tabor and Motoaki Saito

The Shoubu supercomputer from RIKEN, built by PEZY Computing and partner Exascaler Inc., took the top spot on the June 2015 Green500 list, becoming the first TOP500-echelon system to surpass the seven gigaflops/watt milestone.

Supercomputers run very hot and require massive amounts of energy to cool. The Shoubu supercomputer at RIKEN, which topped the June 2015 Green500 list, is a heterogeneous (processor and accelerator) system that combines Intel Haswell CPUs, new PEZY-SC manycore accelerators, and an energy-efficient software design. Shoubu achieved 7.03 gigaflops/watt, a 33 percent improvement over the greenest computer atop the previous Green500 list in November 2014. Shoubu runs at a speed of around 2 petaflops and uses a cooling method, developed by Exascaler, in which the computer is completely immersed in liquid. Motoyoshi Kurokawa of the RIKEN Advanced Center for Computing and Communication (ACCC), who supervised the development project, notes, “This is the first time since 2007, when the award was established, that RIKEN has taken the first spot. It is very exciting for us that we have been able to demonstrate our institute’s commitment to building a sustainable future.” It is also the first winning system developed by Japanese companies.

Reader’s Choice

Intel Omni-Path Architecture

Intel Xeon Phi “Knights Landing” processors

Mellanox GPUDirect 4.0 with NVIDIA

Micron and Intel: 3D XPoint

NVIDIA NVLink high-speed GPU interconnect

Tom Tabor & Charlie Wuischpard

Tom Tabor presents awards to Greg Estes and the NVIDIA team

Tom Tabor with Michael Kagan, Eyal Waldman and Gilad Shainer

Tom Tabor and Tom Eby


Editor’s Choice

Intel Omni-Path Architecture

Intel Xeon Phi “Knights Landing” processors

Cavium‘s ThunderX 64bit ARMv8 Processor

Micron and Intel: 3D XPoint

NVIDIA NVLink high-speed GPU interconnect

Tom Tabor & Charlie Wuischpard

Tom Tabor presents awards to Greg Estes and the NVIDIA team

Tom Tabor with Syed Ali

Reader’s Choice

Cray, Inc.

Intel Corp.

Mellanox Technologies

NVIDIA Corp.

SGI

Tom Tabor & Charlie Wuischpard

Tom Tabor presents awards to Greg Estes and the NVIDIA team

Tom Tabor with Michael Kagan, Eyal Waldman and Gilad Shainer

Tom Tabor & Peter Ungaro

Tom Tabor with Gabriel Broner and Jorge Titinger


Editor’s Choice

IBM

Intel Corp.

Mellanox Technologies

NVIDIA

D-Wave

Tom Tabor & Ken King

Tom Tabor & Charlie Wuischpard

Tom Tabor with Michael Kagan, Eyal Waldman and Gilad Shainer

Tom Tabor presents awards to Greg Estes and the NVIDIA team

Tom Tabor & Bo Ewald

Reader’s Choice

Tom Tabor with Alison Kennedy (left) and Toni Collis (right)

Women in HPC (WHPC)

The Women in High Performance Computing (WHPC) network, based in the UK and supported by the Edinburgh Parallel Computing Centre (EPCC), fosters collaboration and networking by bringing together female HPC scientists, researchers, developers, users, and technicians from across the UK. WHPC encourages women in HPC to engage in outreach activities and improves the visibility of inspirational role models. WHPC activities are complemented by research into the influence of UK equality initiatives on the HPC community. The network aims to build a community of female HPC scientists, technicians, researchers, users, and academics who collaborate, share knowledge, and mentor, addressing the underrepresentation of women in the field and working toward equal representation for all in HPC.


Editor’s Choice

Tom Tabor & Charlie Wuischpard

Intel Global Diversity and Inclusion

In January 2015, Intel committed $300 million to improving diversity and set a bold goal of achieving full representation of women and underrepresented minorities in the Intel workforce by 2020. Intel Chief Diversity Officer Rosalind Hudnell is focused on orchestrating the effort and instilling the culture company-wide. Moreover, many aspects of the program extend beyond Intel; for example, Intel announced a goal of spending $1 billion with diversity-owned suppliers, also by 2020. Further, the company has created a $125 million Diversity Fund, the largest of its kind in the industry, to invest in start-ups run by women and underrepresented minorities. As of August 2015, Intel said it was on track to achieve its overall hiring goal for the year: the company was tracking at 43.3% diverse hires in 2015, exceeding its 40% goal for the United States.

Reader’s Choice

Tom Tabor and Jack Dongarra

(TIE) Jack Dongarra

A co-founder of the TOP500, Jack Dongarra has had an impact on HPC that is difficult to overstate. He is currently a University Distinguished Professor in the Electrical Engineering and Computer Science Department at the University of Tennessee and a researcher at Oak Ridge National Laboratory. Dongarra has long been a champion of the need for algorithms, numerical libraries, and software for HPC, especially at extreme scale. His research includes the development, testing, and documentation of high-quality mathematical software.

Dongarra has contributed to the design and implementation of the following open source software packages and systems: EISPACK, LINPACK, the BLAS, LAPACK, ScaLAPACK, Netlib, PVM, MPI, NetSolve, TOP500, ATLAS, PAPI, and the High Performance Conjugate Gradient benchmark (HPCG). A brief – no doubt incomplete – summary of the awards Dongarra has received: the IEEE Sidney Fernbach Award (2004); the first IEEE Medal of Excellence in Scalable Computing (2008); the first SIAM Special Interest Group on Supercomputing award for Career Achievement (2010); the IEEE IPDPS Charles Babbage Award (2011); and the ACM/IEEE Ken Kennedy Award (2013). He is a Fellow of the AAAS, ACM, IEEE, and SIAM, and a member of the National Academy of Engineering.

Tom Tabor and Nishi Katsuya with Professor Satoshi Matsuoka

(TIE) Professor Satoshi Matsuoka

Satoshi Matsuoka is another pivotal figure in HPC whose accomplishments span a wide scope. He is currently a professor at the Tokyo Institute of Technology (TITech) and the leader of the TSUBAME series of supercomputers. Matsuoka built TITech into an HPC leader, adopting GPGPUs and other energy-efficient techniques early on and helping put Japan on a course for productive exascale science.

Matsuoka’s research spans object-oriented parallel computing, user interfaces, systems software, and global computing environments. He is a fellow of the ACM and of the European ISC, and has won many awards, including the JSPS Prize from the Japan Society for the Promotion of Science (2006), the ACM Gordon Bell Prize (2011), the Commendation for Science and Technology by the Minister of Education, Culture, Sports, Science and Technology (2012), and the IEEE Computer Society Sidney Fernbach Award (2014) for his work on software systems for high-performance computing on advanced infrastructural platforms, large-scale supercomputers, and heterogeneous GPU/CPU supercomputers.


Editor’s Choice

(TIE) Hans Meuer

It is fitting to recognize Hans Meuer, best known as the “Father of European Supercomputing,” who passed away in January 2014. Hans was the co-founder and organizer of the Mannheim Supercomputer Conference in 1986, which became the International Supercomputing Conference in 2001. Working with Erich Strohmaier, Horst Simon, and Jack Dongarra, Meuer started the TOP500. He was a long-time leader at the Jülich Research Centre, and director of the computer center and professor of computer science at the University of Mannheim.

Tom Tabor, HPCwire founder and a longtime friend, remembers: “[Hans was] the proverbial father figure of high performance computing in Europe, the quintessential professor – graying beard, German accent, passion for science and discovery and, most importantly, the desire to use the technology around him to improve humankind’s quality of life. For many of the industry’s forerunners, being on the bleeding edge of both technology and scientific research was all that was important, but Hans’ passion extended beyond that; he sought to bring together the great minds in Europe who could move high performance computing forward.” Clearly he succeeded.

(TIE) George Michael

George Michael was another seminal figure in HPC and a co-founder of the annual ACM/IEEE Supercomputing Conference, first held in 1988. A pioneering computational physicist, he had a 41-year career at Lawrence Livermore National Laboratory, helping to build the lab’s reputation as a leader in supercomputing. George passed away in 2008. He is remembered not only for his technical contributions, which are significant, but also as a community builder with a natural ability to attract and galvanize colleagues around important science goals and projects. SC, perhaps his biggest legacy, now attracts on the order of 10,000-plus attendees annually, representing the leading companies and research institutions in high performance computing from around the world.
