Top500 Results: Latest List Trends and What’s in Store

By Tiffany Trader

June 19, 2017

Greetings from Frankfurt and the 2017 International Supercomputing Conference, where the latest Top500 list has just been revealed. Although there were no major shakeups (China still has the top two spots locked with the 93-petaflops TaihuLight and the 33.8-petaflops Tianhe-2), there are some interesting historical and global trends to share, as well as notable Green500 results, with Japan capturing the top four spots.

In the top ten strata of the 49th Top500 list, the names are the same, but Piz Daint, the Cray XC50 system installed at the Swiss National Supercomputing Centre (CSCS), has moved up five positions from number eight to number three. The boost came from replacing older Tesla gear with Nvidia Tesla P100 GPUs (see coverage here for more details), doubling the previous 9.8-petaflops Linpack score to 19.6 petaflops. The Intel processors were also upgraded, from the Sandy Bridge to the Haswell architecture.

Piz Daint’s rise has pushed the 17.6-petaflops U.S. Titan supercomputer down to fourth position, leaving the United States without a claim to any of the top three rankings. As the Top500 authors observe in today’s announcement, the only other time this has happened was in November 1996, when Japan dominated all three top spots.

For reference, the new top 10 rankings are reproduced below:

Source: Top500

This minimal list reshuffling led Top500 watcher and market analyst Addison Snell to comment, “With no changes in the Top 10 systems other than the Piz Daint upgrade, it may look like things aren’t moving forward, but this is the lull before a spate of new supercomputers that could hit the next list, particularly the two CORAL pre-exascale systems at U.S. national labs and the possibility of the Chinese Tianhe-2A upgrade.

“Some of the more interesting trends occur over the rest of the list population,” the CEO of Intersect360 Research continued. “For example, the number of manycore systems continues to rise, whether as accelerators or co-processors, which are mostly Nvidia GPUs, or with the Intel Xeon Phi as a standalone processor. This is driving related improvements in power efficiency, which is necessary in the run-up to exascale. It’s also notable to see Intel Omni-Path adoption continuing on the list. We are monitoring this in our end-user surveys to see how much penetration Omni-Path might have versus Ethernet and InfiniBand.”

When asked if he thought the CORAL systems, Summit and Sierra, would be ready this time next year, Snell said he wouldn’t be surprised if they come sooner than that. So SC17? We caught Snell in between flights, but we’ll be asking him more about his thinking here during ISC. The U.S. has announced an accelerated exascale timeline (see our latest U.S. exascale coverage here) and promised additional monies to fund it, so a quickening for “pre-exascale” here makes sense if partners IBM, Nvidia and Mellanox can accommodate.

Then, as Snell also noted, there is still the matter of the Tianhe-2A system. The Tianhe-2 upgrade, which was to go forward with Feiteng processors after a U.S. embargo derailed the Knights Landing refresh, has not yet materialized. Signs now point to Tianhe-2A being NUDT’s exascale prototype, one of three exascale contenders in China (along with the Sugon and Wuxi Supercomputing Center efforts). It is speculated that the next Tianhe will employ the Feiteng FT-2000/64 that Phytium Technologies introduced at the 2016 Hot Chips conference. The FT-2000/64 is a 64-core ARM processor with a stated peak performance of 512 gigaflops at a frequency of 2.0 GHz in a 100-watt power envelope (max).
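To put the stated FT-2000/64 figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (64 cores, 2.0 GHz, 512 gigaflops peak, 100 watts max). The derived values are simple ratios of those vendor-stated peaks, not measured results, and the variable names are illustrative.

```python
# Vendor-stated peak figures for the FT-2000/64, as quoted in the article.
cores = 64
clock_hz = 2.0e9          # 2.0 GHz
peak_flops = 512e9        # 512 gigaflops peak
max_power_watts = 100.0   # stated maximum power envelope

# Floating-point operations per core per clock cycle implied by the peak figure.
flops_per_core_per_cycle = peak_flops / (cores * clock_hz)

# Peak efficiency at the maximum power envelope, in gigaflops per watt.
# This is a theoretical chip-level ratio, not a Linpack or Green500 measurement.
peak_gflops_per_watt = (peak_flops / 1e9) / max_power_watts

print(f"Implied flops per core per cycle: {flops_per_core_per_cycle:.1f}")
print(f"Peak gigaflops per watt (chip only): {peak_gflops_per_watt:.2f}")
```

Note that the chip-level ratio is not directly comparable with Green500 scores, which are based on measured Linpack performance and system power.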

Splitting the Top500 Pie

While the U.S. has lost supremacy at the peak, it counts five systems within the top ten, still more than any other country. The U.S. leads total system share as well with 169 machines. China is a close second with 160. Recall the U.S. and China were tied with 171 systems each six months ago, but other countries have assumed some of that share, notably Japan and the UK. Japan is now third with 33 supercomputers up from 27 in November. Germany ranks fourth with 28, down from 31. France and the UK are tied for fifth with 17 systems each, with France dropping three systems and the UK adding four.

Shifting the perspective to aggregate performance share maintains the ordering: U.S. (33.8 percent), China (32 percent), Japan (6.6 percent), Germany (5.6 percent), France (3.4 percent), United Kingdom (3.4 percent).

Looking at the vendor landscape, Hewlett Packard Enterprise (HPE) asserts itself as the number one vendor by system volume with 143, picking up 25 systems in the SGI acquisition, finalized last November. Lenovo is second with 88 systems, followed by Cray (57 systems), Sugon (46) and IBM (27). On the previous list iteration, it was HPE (112 systems), Lenovo (92 systems), Cray (56 systems), Sugon (47) and IBM (with 33). There was only one new IBM system on today’s listing.

June 2017 Top500 vendor tree map (percent of total list performance)

When it comes to total list performance share, Cray maintains its lead at 21.4 percent, a skosh up from 21.3 percent six months back. Bolstered by its SGI acquisition, HPE comes back to a solid second place with 16.6 percent, up from 9.8 percent. With the strong showing of the combined HPE+SGI installs, Sunway TaihuLight developer NRCPC drops to third with 12.5 percent of the total installed performance (down from 13.8 percent). Lenovo is next (9.3 percent, up from 8.8 percent), then IBM (7.5 percent, down from 8.8 percent).

The aggregate performance of all 500 computers on the 49th list stands at 749 petaflops, compared to 672 petaflops six months ago and 567 petaflops one year ago. This 32 percent annual growth rate is far below historical trends, which prior to 2008 averaged about 90 percent per year and more recently averaged around 55 percent per year. It’s a trend that shows no signs of reversal, according to the Top500 authors.
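The growth figure above follows directly from the aggregate totals. A minimal Python sketch of the arithmetic, using only the petaflops values quoted in the paragraph (the function and variable names are illustrative):

```python
# Aggregate Linpack performance of all 500 systems, in petaflops,
# as quoted in the article.
aggregate_pflops = {
    "June 2016": 567.0,
    "November 2016": 672.0,
    "June 2017": 749.0,
}


def growth_percent(old: float, new: float) -> float:
    """Return the percentage increase from old to new."""
    return (new - old) / old * 100.0


annual = growth_percent(aggregate_pflops["June 2016"], aggregate_pflops["June 2017"])
half_year = growth_percent(aggregate_pflops["November 2016"], aggregate_pflops["June 2017"])

print(f"Year over year: {annual:.1f}%")      # Year over year: 32.1%
print(f"Last six months: {half_year:.1f}%")  # Last six months: 11.5%
```

The year-over-year result rounds to the 32 percent cited by the Top500 authors.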

Source: Top500

The aggregate performance of the top ten machines is 235.9 petaflops, up from 226 petaflops, an increase owed solely to the Piz Daint upgrade. Twenty-one systems have joined the petaflops club, bringing total membership to 138 from 117 six months ago. The admission point for the TOP100 is currently 1.21 petaflops, up from 1.07 petaflops. The bar for entry onto the list has been raised to 432.2 Linpack teraflops, compared to 349.3 teraflops on the last list.

Other notable trends observed by Top500 authors:

  • A total of 91 systems on the list are using accelerator/co-processor technology, up from 86 in November 2016. Of these, 71 use Nvidia chips, 14 use Intel Xeon Phi technology (as co-processors), one uses ATI Radeon, and two use PEZY technology. Three systems use a combination of Nvidia and Intel Xeon Phi accelerators/co-processors. An additional 13 systems now use Xeon Phi as the main processing unit.

    Accelerator/Co-processor trends through June 2017 (Source: Top500)

  • The average number of accelerator cores for these 91 systems is 115,000 cores/system.
  • Intel continues to provide the processors for the largest share (92.8 percent) of TOP500 systems.
  • Ninety-three (93.0) percent of the systems use processors with eight or more cores, sixty-eight (68.6) percent use twelve or more cores, and twenty-seven (27.2) percent use sixteen or more cores.
  • Gigabit Ethernet is now at 207 systems (unchanged), in large part thanks to 194 systems now using 10G interfaces. InfiniBand technology is now found on 178 systems, down from 187 systems, and is the second most-used internal system interconnect technology.
  • Intel Omni-Path technology, which made its first appearance one year ago with 8 systems, is now at 38 systems, up from 28 systems six months ago.
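As a quick sanity check on the accelerator breakdown in the list above, the per-vendor counts can be tallied in a few lines of Python. The counts are the ones quoted by the Top500 authors; the category labels and the share calculation are illustrative additions.

```python
# Accelerated systems by technology, as quoted in the list above.
accelerated_systems = {
    "Nvidia GPUs": 71,
    "Intel Xeon Phi (co-processor)": 14,
    "ATI Radeon": 1,
    "PEZY": 2,
    "Nvidia + Xeon Phi combination": 3,
}

total = sum(accelerated_systems.values())

# Share of each technology among accelerated systems only
# (not among all 500 systems on the list).
for name, count in accelerated_systems.items():
    print(f"{name}: {count} systems ({count / total:.1%})")

print(f"Total accelerated systems: {total}")  # Total accelerated systems: 91
```

The categories add up to the 91 systems reported, which excludes the 13 systems that use Xeon Phi as the main processor.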

Also noteworthy, the Top500 list now incorporates the HPCG benchmark results “to provide a more balanced look at performance,” according to the list editors. They further report that “the fastest system on the HPCG benchmark is Fujitsu’s K computer which is ranked #8 in the overall Top500. It is followed closely by Tianhe-2 which is also No. 2 on the Top500.” This lineup is unchanged since the November HPCG ranking results.

Highlights from the Green500

The new list has an interesting tale to tell when it comes to energy efficiency metrics. Japan captured the top four spots of the Green500 with four new systems, and the upgraded Swiss Piz Daint has the sixth spot. The fact that all of the top six systems employ Tesla P100 GPUs speaks well for Nvidia, which also claims the eighth through fourteenth Green500 spots.

At the top of the green ranking, touting 14.110 gigaflops/watt, is the new TSUBAME 3.0, a modified HPE ICE XA machine designed by Tokyo Tech and HPE. The system earned a 61st place spot on the TOP500 with a 1.998-petaflops Linpack run. The new Green500 record holder bests the previous record set by Nvidia’s internal Saturn V supercomputer six months ago (8.17 gigaflops/watt) by 72.7 percent.
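The 72.7 percent figure can be reproduced from the two efficiency scores. The short Python sketch below does that and, as an additional illustrative step, derives the power draw implied by dividing the Linpack result by the efficiency score. That implied power is a derived estimate, not a number reported in this article.

```python
# Green500 efficiency scores quoted in the article, in gigaflops per watt.
tsubame3_gflops_per_watt = 14.110
saturn_v_gflops_per_watt = 8.17

# Relative improvement of the new record over the previous one.
gain_percent = (
    (tsubame3_gflops_per_watt - saturn_v_gflops_per_watt)
    / saturn_v_gflops_per_watt
    * 100.0
)
print(f"Improvement over previous record: {gain_percent:.1f}%")  # 72.7%

# Implied power draw during the Linpack run (derived, not quoted):
# 1.998 petaflops = 1,998,000 gigaflops.
tsubame3_linpack_gflops = 1.998e6
implied_power_kw = tsubame3_linpack_gflops / tsubame3_gflops_per_watt / 1000.0
print(f"Implied power during Linpack: about {implied_power_kw:.0f} kW")
```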

The second-place Green500 system is “kukai,” built by Exascaler and installed at the Yahoo Japan Corporation. It achieves 14.045 gigaflops/watt, less than 0.5 percent behind TSUBAME 3.0. Its Top500 ranking is 466. Coming in at number three is the AIST AI Cloud system at the National Institute of Advanced Industrial Science and Technology, Japan. The NEC machine achieves 12.681 gigaflops/watt and is ranked number 148 on the Top500. The fourth-place Green500 system is the Fujitsu-made RAIDEN GPU system, installed at RIKEN’s Center for Advanced Intelligence Project. It accomplished 10.603 gigaflops/watt and sits at number 306 on the Top500 line-up. The Dell Wilkes-2 machine installed at the University of Cambridge is in fifth place with 10.428 gigaflops/watt. Its Top500 ranking is 100.

Piz Daint, the sixth-ranked supercomputer on the Green500, achieved 10.398 gigaflops/watt. For the number three system on the Top500, this is quite the accomplishment, as the latest energy-efficiency technologies don’t always scale well or make it to the top of the list due to long development cycles. The fact that Piz Daint is the most energy-efficient supercomputer within the top 50 fastest supercomputers speaks to that point.

In seventh position is “Gyoukou,” the Exascaler ZettaScaler-1.6 system at the Japan Agency for Marine-Earth Science and Technology, with 10.226 gigaflops/watt. Relying on PEZY-SC2 accelerators, Gyoukou is the highest-ranking non-GPU system on the Green500 list.

The TOP500 and Green500 awards will be presented by Top500 co-author Horst D. Simon, deputy director of Lawrence Berkeley National Laboratory, at 10:30 am today in Frankfurt. We expect lots more analysis to come out of the Top500 and Green500 program tracks. We will report back on these and other benchmarking results presented at ISC 2017. If you have any insights or comments to share, please catch me by email or in person at the show.
