TSUBAME Prototype System Balances Benchmark Leadership

By Nicole Hemsoth

July 9, 2014

When it comes to large-scale supercomputing installations, Asia is the continent to watch carefully over the next few years. The region is already host to the top system in the world, China’s Tianhe-2, and Japan and others have ambitions to take over the top ten list of systems.

In the most recent rankings, Japan is home to 30 of the Top 500 systems worldwide, up from just 18 in 2010. In addition to the #4 K Computer system at RIKEN, another noteworthy machine, the #14-ranked TSUBAME 2.5 (or TSUBAME KFC, named for the Kepler Fluid Cooling component, which ironically means it’s submerged in oil), is garnering attention. Part of why this 76,032-core system is one to watch is that it’s setting the stage for a new machine with far higher performance, along with some novel integrations of unique storage, network, middleware and other technology.

This month at ISC, Satoshi Matsuoka from the Tokyo Institute of Technology presented an overview of progress with the TSUBAME KFC machine, which is the precursor and prototype for the next-generation system of the same name that the team will roll out sometime in 2016. When it emerges, TSUBAME 3 is expected to hit the 25-30 petaflop performance mark while balancing some new technologies cooked into the middleware, storage, and network, in addition to the “Kepler Fluid Cooling” that was at the heart of the prototype’s efficiency ranking on the Green 500 this summer, where it was the top system.

“Assuming that TSUBAME-KFC’s energy efficiency could be scaled linearly to an exaflop supercomputing system, one that can perform one [quintillion] floating-point operations per second, such a system would consume on the order of 225 megawatts (MW),” said Wu Feng of the Green 500. “Although this 225-megawatt power envelope is still quite far from DARPA’s optimistic target of a 67-megawatt power envelope, it is an order of magnitude better than the initial projection of a nearly 3000-megawatt power envelope from 2007 when the first official Green500 list was launched.”
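
As a rough check on the arithmetic in that projection, the sketch below works backward from the quoted 225 MW figure to the energy efficiency it implies and compares the three power envelopes mentioned. The function and constant names are illustrative assumptions; only the megawatt figures come from the quote, and the linear-scaling premise is the one Feng states.

```python
EXAFLOP = 1e18  # one exaflop: 10**18 floating-point operations per second


def implied_gflops_per_watt(power_mw: float, flops: float = EXAFLOP) -> float:
    """Efficiency (GFLOPS per watt) needed to sustain `flops` at `power_mw` megawatts."""
    watts = power_mw * 1e6
    return flops / watts / 1e9


def ratio(larger_mw: float, smaller_mw: float) -> float:
    """How many times larger one power envelope is than another."""
    return larger_mw / smaller_mw


# Power envelopes quoted in the article, in megawatts.
TSUBAME_KFC_SCALED_MW = 225   # TSUBAME-KFC efficiency scaled linearly to an exaflop
DARPA_TARGET_MW = 67          # target cited in the quote
PROJECTION_2007_MW = 3000     # "nearly 3000-megawatt" projection from 2007

print(f"Implied efficiency: {implied_gflops_per_watt(TSUBAME_KFC_SCALED_MW):.2f} GFLOPS/W")
print(f"Improvement over 2007 projection: {ratio(PROJECTION_2007_MW, TSUBAME_KFC_SCALED_MW):.1f}x")
print(f"Remaining gap to DARPA target: {ratio(TSUBAME_KFC_SCALED_MW, DARPA_TARGET_MW):.1f}x")
```

Under these figures the scaled system would be roughly 13 times better than the 2007 projection, which matches the “order of magnitude” in the quote, while still needing a bit more than a threefold efficiency gain to reach the cited target.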

But the Green 500 and the prototype system’s placement in the Top 500 are just part of a larger story, one that Matsuoka doesn’t want the community to overlook. It’s about handling the next generation of data-intensive applications, which is an area full of lessons from outside of supercomputing.

The focus of TSUBAME 3 (and leading into TSUBAME 4 in the 2020-2022 timeframe) will be on balancing efficiency, data-readiness, and of course performance or, as Matsuoka described in his talk, a convergence of supercomputing with extreme big data. We are all aware of the bubble big data has presented in HPC, but Matsuoka says it’s critical to design systems that integrate lessons learned from hyperscale cloud datacenters as well as what appears ahead for eventual exascale-class systems.


The current KFC machine ranked #12 on the Graph 500, and #6 on the Green Graph 500, which looks at the energy efficiency of solving “big data” graph problems. This is where the real future focus of the system in its 2016 incarnation will be, says Matsuoka. As he explained, when the Graph 500 list began, the expectation among some was that it would look far different from the Top 500, with a number of cloud vendors submitting their distributed machines for the rankings. However, the list looks quite similar to the Top 500, with the same machines at the top that lead the Top 500 and, to a lesser extent, the Green 500. The hope is to balance top results across these categories with an eye on real-world applications, not just benchmark toppling.

These early predictions stood to reason since, ostensibly, the big clouds were tackling “big data” jobs. The common estimate is that a giant web services company like Amazon has around 500,000 nodes with around 6 million cores spread throughout its network. That makes for a massive distributed machine, but the per-node core counts of these cheaper servers are often far lower than those of ultra-dense supercomputers. For instance, Tianhe-2 has 3 million cores spread across 18,000 nodes. Matsuoka says this point isn’t a surprising one: large datacenters are common, but they tend to be very sparse. They don’t require the networking and density of supercomputers, and therefore don’t have the same capability.
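
To make the density comparison concrete, here is a minimal sketch that divides the core counts by the node counts quoted above. The figures are the article’s rounded estimates, and the helper name is an assumption introduced for illustration.

```python
def cores_per_node(total_cores: int, nodes: int) -> float:
    """Average number of cores per node for a system."""
    return total_cores / nodes


# Rounded estimates quoted in the article.
amazon_density = cores_per_node(total_cores=6_000_000, nodes=500_000)
tianhe2_density = cores_per_node(total_cores=3_000_000, nodes=18_000)

print(f"Amazon (estimated): {amazon_density:.0f} cores per node")
print(f"Tianhe-2:           {tianhe2_density:.0f} cores per node")
print(f"Density ratio:      {tianhe2_density / amazon_density:.1f}x")
```

On these numbers the cloud estimate has twice the cores in total, but Tianhe-2 packs roughly an order of magnitude more cores into each node, which is the sparseness Matsuoka is pointing to.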

The goal of the next incarnation of TSUBAME in 2016 will be reducing the size of the system while supplying the needed bandwidth and compute horsepower in a much smaller amount of space. Cheap SSDs, ultra-dense system design, and new uses of burst buffer technology to offload critical processing tasks are key to the approach with both the coming TSUBAME 3 and the future TSUBAME 4 machine.

Matsuoka says that TSUBAME 3 will feature larger-capacity SSDs that will give the Tokyo Tech team about 50 TB of local capacity and 50 GB/s of bandwidth in a single rack. Supposing 40 racks, that adds up to several terabytes per second of aggregate bandwidth. They’re working with DDN now to further this future.
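
The per-rack figures can be scaled up with a short calculation. This is only a sketch: it assumes both the 50 TB and the 50 GB/s figures are per rack, that bandwidth adds up linearly across the 40 supposed racks, and that decimal units apply (1 TB/s = 1,000 GB/s). The names are illustrative.

```python
RACKS = 40                # the rack count Matsuoka supposes
BANDWIDTH_GB_S_RACK = 50  # local SSD bandwidth per rack, GB/s
CAPACITY_TB_RACK = 50     # local SSD capacity, assumed here to be per rack


def aggregate(per_rack: float, racks: int = RACKS) -> float:
    """Total across all racks, assuming perfectly linear scaling."""
    return per_rack * racks


aggregate_bandwidth_tb_s = aggregate(BANDWIDTH_GB_S_RACK) / 1000  # GB/s to TB/s
aggregate_capacity_pb = aggregate(CAPACITY_TB_RACK) / 1000        # TB to PB

print(f"Aggregate bandwidth: {aggregate_bandwidth_tb_s:.1f} TB/s")
print(f"Aggregate capacity:  {aggregate_capacity_pb:.1f} PB")
```

With exactly these inputs the total comes to about 2 TB/s, so the “several terabytes per second” in the talk presumably reflects rounding or a somewhat larger configuration.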

The Top 500 list in 2016 is set to be an interesting one, particularly in November, with the addition of this next-generation machine and several others we’ve heard word of. Not all the major machines set to come online by Linpack time will be running the famous benchmark, since it doesn’t adequately reflect their goals. Japan, however, is expected to take advantage of all three major benchmarking opportunities (Top 500, Green 500, and Graph 500) to show the balanced system they’re seeking: one that’s ultra-efficient, big data capable, and of course, high performing in a top 10-class way.
