Back to the Future of Serial Speed?

By Bill Sembrat

October 30, 2013

For the last few decades we have had great increases in performance. Since the move to “off-the-shelf components,” we have ridden on the coattails of steady processor improvements along with ever greater numbers of chips and cores, and some have come to realize that this can’t go on forever.

We have filled a number of ever-bigger rooms with racks and racks, until we have come face to face with having to own our own power company just to supply the power needed to run these complexes. On another front, I would have expected more concern about the high cost of these high-end systems.

It also seems that others are raising doubts, because recently there have been a lot of questions, discussions, comments, and articles about the problems, and enough concern about the way forward to even provide extra funding. But few seem to be looking at root issues and saying that, moving forward, things need to change.

We have been lucky to ride the train so far, much longer and further than any of us would ever have imagined. But now it has become ever harder and more costly to keep the train going. So let’s take a deeper look at the road traveled. About 20 years ago the HPC train switched paths from custom processors and custom systems to off-the-shelf processors and systems. It has been a good and fruitful ride.

I think it would be very wise to notice that recently most of the speed improvement has come from the parallel side. We, on the other hand, were always highly focused on the serial side. At this point maybe I should say something about myself. I was fortunate to have worked with Seymour Cray for many years, so the “we” I am referring to reflects my involvement and experience with Seymour and Seymour’s machines. Seymour was never in any race and was not really concerned about what someone else might or might not be doing, but he was always interested in exploring and pushing serial speed on real workloads. We focused mainly on serial speed because it kept things simple and made systems easier to use and easier to program, with less overhead and higher system efficiency.

Few may have ever talked to Seymour about serial speed vs. parallel speed, but I can tell you that Seymour was always quite aware of the distinction and disciplined himself to stay focused on serial speed improvements. He felt that there he could contribute more and add more value, that the work was more personally challenging, and he very much enjoyed working on serial improvements.

Although he would never admit it, he also knew that he was the “king” of serial speed. As a side comment, Seymour was also interested in exploring the far end of parallel processors, and we had a running prototype parallel machine with a design goal of 30,000,000 processors, but that is quite another story. Getting back to the topic: with the “Crays” we were always highly focused on serial speed. Over the last 20-odd years the off-the-shelf path has relied on serial speed improvements, but increasingly on greater and greater parallel speed improvements. Parallel speed improvements naturally bring higher overhead and power costs along with lower system efficiencies, and now ever higher costs to reach the top of the list.
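One way to make the efficiency argument concrete is Amdahl's law, which is not named in this article and is used here only as an illustrative assumption. It says that if a fraction s of a workload is inherently serial, the ideal speedup on N processors is 1 / (s + (1 - s) / N). The minimal Python sketch below uses hypothetical function names and an assumed 5% serial fraction; it ignores communication and power costs, so real systems would do worse.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Ideal speedup under Amdahl's law (an assumed model, not the author's).

    serial_fraction is the share of the work that cannot be parallelized.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def parallel_efficiency(serial_fraction: float, processors: int) -> float:
    """Speedup per processor: 1.0 means every processor is fully useful."""
    return amdahl_speedup(serial_fraction, processors) / processors


if __name__ == "__main__":
    # Assumed workload: 5% of the run time is serial.
    s = 0.05
    for n in (1, 10, 100, 1_000, 10_000):
        print(f"{n:>6} processors: speedup {amdahl_speedup(s, n):7.2f}, "
              f"efficiency {parallel_efficiency(s, n):.4f}")
```

Under this model the speedup can never exceed 1/s no matter how many processors are added, while a faster serial path raises that ceiling directly, which is the trade-off described above.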

So to get large, cost-effective improvements, I think that we now need to refocus on serial speed improvements. I believe that speed improvements of 50X+ can be achieved on the serial side, because we were addressing root-level changes that could lead to these kinds of improvements. This quickly leads one to a startling conclusion: memory can’t keep up, does not work, and becomes the big elephant in the room. So you really need to look at how memory is used, and really the only way to see it is to wipe the slate clean and get rid of memory. In order to think about it, you first need to get rid of it and start again fresh. Very few may be up to the task of starting fresh with a blank sheet. This is a rather hard task and not as simple as one may think.

Seymour always preferred blank paper pads with faint light blue lines and number 2 pencils, even at a time when, it seemed, everyone had started using computers, and in some cases even “Cray” supercomputers, to design the “next” machine. Einstein never needed or used a computer for his theories, and I would guess that Peter Higgs didn’t use one either. Giant leaps and great things seem to come from very simple root ideas. A positive, can-do attitude also plays an important part, maybe the most important part, even in the face of seemingly impossible tasks.

The memory model currently used is largely based on a 70-year-old model. If you could wake up the people who came up with the model, who were in the farmhouse/barn in Princeton at the time, they would be quite amazed at the great strides and progress, but in very short order they would be able to program today’s machines. So in some ways things really haven’t changed much. Other areas will need to be addressed and changed, but memory is the first and most looming problem. Because these changes are deeper root issues, they should be hidden from users and even from most of the vast layers of existing software. Funny, you may think that this is new, but most of it was tried and used years ago, never commercialized, and sometimes discarded because the technology available at the time was lacking.

Well, yes, I do believe that by addressing some deep root issues, large serial speed improvements can be achieved over time. But to use them you will quickly come to several conclusions, including that you must deal with new ways to see and use memory and all that this implies. To achieve very large improvements, I think, the focus needs to be on several very fundamental root changes, followed by applying all the parallel knowledge and improvements made over the last 20 years. Now here, I believe, may be a bigger problem. In the US we have been blessed with chip and system vendors that have been able to supply ever-increasing speeds and lots of chips and cores, so we have been glued to that path. Others may be unencumbered, highly motivated, and more able to do something new and different.

Although they may operate under a different set of rules and have other problems of their own, they do not have as much invested in existing ideas, enterprises, and hard plant and equipment. They may be less locked in and more willing to change pathways. So I am concerned with our current shortsighted attitude and lack of “Americanism” in keeping the leadership local.
