Back to the Future

By Tom Gibbs, Contributing Author

April 2, 2007

You may not need to fire up Dr. Emmett Brown’s “flux capacitor” and hop a ride with Marty McFly to go “Back to the Future,” because the computing portion of Grid 2.0 is going through a metamorphosis that is just beginning to take shape, and it’s starting to look very familiar. A great philosopher once said, “Those who cannot remember the past are condemned to repeat it.” We may not exactly be doomed to a stroll down the “green mile” in this case, but it does look like we’ll be reliving some trends from computing’s storied past.

Over the past year, I’ve been focused on the communications and consumer usage models that are emerging with the current phase of the approach to delivering information technology commonly called “grid.” My observation about a year ago tied an overall shift in the way consumers and businesses were acquiring and using software to the approach for developing and deploying IT infrastructure being pursued by the grid community. The new wave of software-as-a-service that is born and delivered over the Internet was, and is, being called out in headlines as “Web 2.0,” after a term coined by Tim O’Reilly a few years ago. I termed the application of a grid-based infrastructure to the emerging Web 2.0 application development and delivery model “Grid 2.0.”

The emergence of Web 2.0 and the underlying Grid 2.0 infrastructure have been taking off faster than the user community of the virtual reality game “Second Life.” While this has been happening, I’ve started to observe that the traditional computing and scientific usage models that are the cornerstone of Grid 1.0 may be going through a quieter but perhaps just as fundamental transformation. The current changes bear a striking resemblance to events that unfolded 40 years ago in the initial stages of the computer industry. I’m beginning to conclude that we may be coming full circle with respect to meeting the needs of the technical computing user group. While not quite as solemn as the late, great philosopher Santayana, the former great catcher and manager for the New York Yankees may have summed this up best when he observed, “It’s like déjà vu all over again.”

If we did decide to join Marty McFly — jumped into the DeLorean, popped the clutch, kicked Dr. Emmett Brown’s flux capacitor into gear and zoomed back to the early 1990s — we could choose to land in Illinois and meet up with Doctors Ian Foster, Charlie Catlett and Carl Kesselman to get a look at the beginning of the concept they would name “grid.” We’d meet with some true visionaries who were developing an approach to architectural virtualization on a grand scale, and we’d see a scientific computing industry in a state of massive transition. The computer systems used by nearly 100 percent of the community doing large-scale scientific research and engineering over the preceding decade were based on custom processors, or used attached custom processors, to perform the numerical calculations. The shift that was happening was from custom processors to commercial off-the-shelf processors. The executives in the custom processor industry would come to refer to this epoch as “the attack of the killer micros,” a pejorative reference to the title of a cult classic B-movie with a famously absurd plot line.

In this case, however, the plot line of the trend wasn’t absurd at all. It was based on the empirical observation made in 1965 by Dr. Gordon Moore, later a co-founder of Intel (the observation now known as Moore’s Law), that roughly every 18 months a combination of economics and science would allow the number of transistors that can be cost-effectively manufactured on a single die of silicon to double. When applied to general-purpose microprocessors, this doubling in feature density resulted in a doubling in raw performance because the closer you could pack the transistors together, the faster you could run the processor — with the speed of light being the governing factor. Whether you apply this rate of improvement to your retirement portfolio or processor performance — or anything else for that matter — the growth is exponential. Hindsight is 20/20. Looking back now, it’s clear that the custom processor designers couldn’t keep up.
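To make that compounding concrete, here is a minimal sketch in Python, assuming an 18-month doubling period and a hypothetical baseline transistor count (roughly that of an early-1970s microprocessor); the exact numbers are illustrative, not a claim about any particular chip.

```python
# Illustrative only: compound growth under an assumed 18-month doubling period.
def transistors_after(years, start=2_300, doubling_period_years=1.5):
    """Project a transistor count forward, assuming it doubles every 1.5 years.

    `start` is a hypothetical baseline (roughly an early-1970s microprocessor);
    the point is the shape of the curve, not the exact numbers.
    """
    doublings = years / doubling_period_years
    return start * 2 ** doublings

for years in (3, 9, 15, 30):
    print(f"{years:2d} years -> ~{transistors_after(years):,.0f} transistors")
```

Fifteen years of those doublings works out to roughly a thousandfold increase, which is the kind of curve no hand-crafted design cycle could hope to outrun.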

There was another aspect of the continual improvement in general-purpose microprocessor performance that might have been more important than bragging rights in raw performance: Software developers could just ride the exponential performance curve. Why tweak code to gain performance advantages when you could just wait a year and the processor vendor would give it to you for free? Software developers had to do some level of parallel processing to take full advantage of the “killer micros,” but they were already doing that anyway to get more performance out of the custom processors. The end of the story was certain; it was only a matter of time.

A proxy for this transition is the TOP500 list of computer systems, which went from being dominated by custom processor-based systems in the early ‘90s to being almost completely dominated by systems based on general-purpose microprocessors in the last few years. Game over! But before we declare clear and present victory, let’s take a peek at how the game began, and we may see that it might be more appropriate to declare “Inning over!”

We could choose to rescue McFly from the killer micros, hop back in the DeLorean and go back to the early 1970s, not long after Dr. Moore derived his eponymous law. Once the smoke cleared, we could expect to hear the reverend Al Green belt out his No. 1 hit “Let’s Stay Together” on the radio. We’d find that the leading computer designers of the time — Gene Amdahl and Seymour Cray — were deciding they didn’t want to live the lyrics and were leaving their respective and respected employers — IBM and Control Data Corporation — and setting out to build their own unique systems.

Cray, in particular, was frustrated with the lack of innovation in floating point performance and in a few years would bring the Cray-1 to market. It should be noted that this breakthrough design came to market with no compiler or operating system at about the same time as Ted Hoff at Intel developed the first microprocessor for a calculator company. Neither of them probably thought about it then, but a race was on.

Cray and the other custom designers toiled long and hard to develop a solution to a basic problem of the time: Technical computing users needed far more computing power than general-purpose computers, let alone microprocessors, could deliver. The gap in performance, and the related demand for additional computing power, was large enough to support multiple vendors and approaches, with single-system prices in the range of $5 million to $10 million, which, adjusted for inflation, translate to roughly $50 million to $100 million in current dollars.

The late ‘70s saw a wide variety of approaches to the basic performance problem. Some designers like Cray developed fully integrated systems. IBM developed a separate “vector” unit that could be plugged into one of its general-purpose mainframes, and Seymour Cray’s former employer finally bootstrapped itself out of financial trouble and brought its Cyber series of products to market. Other companies like Floating Point Systems developed array processors that plugged into an I/O slot on a general-purpose system such as IBM mainframes or Digital Equipment Corporation minicomputers.

If we kicked the DeLorean into gear and hit the brakes in the early 1980s, with longitude and latitude in upstate New York, we might be listening to Joe Cocker’s No. 1 hit “Up Where We Belong,” which starts with the lyric “Who knows what tomorrow brings?” If we trucked on over to Cornell University, we could meet with Dr. Ken Wilson, who had at least part of the answer. Wilson would win the Nobel Prize in Physics in 1982, which was the first time the prize was awarded for theory supported entirely by computer simulation.

He would then become the head of the Cornell Theory Center, a facility built around a parallel configuration that combined IBM mainframes with vector facilities and array processors from Floating Point Systems, some of which carried additional custom accelerator cards to speed up specific functions like matrix algebra, an indication of the ingenuity required to achieve speed. And you thought the flux capacitor under the hood of the DeLorean was a wild design worthy of Rube Goldberg. In the mid-’80s, these were the lengths to which one went in the name of computer-assisted science.

Oh, it was a heady time for computer hardware architects — which resulted in a giant headache for software developers. Each of the fully integrated systems, like the Cray and Control Data Cyber machines, had its own operating system, compilers and libraries, which were always afterthoughts of the original design teams. The array processors were notoriously tricky to program, with unique library functions and thorny overhead issues as you moved data on and off the I/O bus and dealt with different data formats. Because there was no common architecture for tools developers to build to, the state of the industry for software was clumsy at best. Yuck. Let’s get outta here!

If we hopped back into the DeLorean and got off in mid-2001, we’d find most of the focus in the computer industry was on deflated stock valuations and excess inventories. It would be easy to relate to Usher’s No. 1 hit “U Got It Bad,” whose lyrics “Everything that used to matter don’t matter no more” would ring true across the entire computing industry. In the midst of this economic fog there was a quiet but growing concern among the dedicated scientists developing future microprocessor products, but the oft-feared end of Moore’s Law wasn’t the issue; the scientific brain trust concluded we could drive higher transistor densities for at least another 10 to 15 years. Unfortunately, there was a more immediate issue, one summed up by the theories of Dr. James Clerk Maxwell.

Unlike Dr. Emmett Brown, Dr. Maxwell was a real scientist, and unlike Moore’s Law, Maxwell’s equations and relations are theory grounded in mathematics that time cannot amend. Moore’s Law, by contrast, will at some point cease to be a law, and Moore and others in the field of semiconductor design were confronted with the reality that the laws of electromagnetism and thermodynamics won’t come to an end anytime soon. The processor designers were forecasting that they could keep cramming transistors closer together per Moore’s Law and run them at higher frequencies, just as they had been doing for the last 25 years, and the processors would continue to go faster, but they’d also be about as hot as a star. And not a big star like Britney Spears, but a big star like the Sun.

The early warning signals came from power users on Wall Street, who were seeing their utility bills go through the roof and were starting to hit the megawatt limits in their respective computer rooms. Then the large Web-based service providers like Google started to put their servers in every nook and cranny with a power outlet. We’re talking about a literal garage shop! More recently, Google, Microsoft and Yahoo started buying plots of land on the Columbia River to take advantage of the power and potential cooling capability.

There was one obvious solution: Use the improved feature size to put multiple processor cores on a single die, cores that would not run at such high frequencies but would deliver increased performance by running in parallel. The concept was given the name multi-core, and the result would be as fantastic as light beer: it would taste about the same and have fewer calories. However, it didn’t take a connoisseur to uncover the nasty aftertaste, and the free ride for software was about to end.

Now, this didn’t mean that software developers would start checking in and out of rehab and shave their heads, but at a minimum they’d have to recompile their applications. This sometimes was required with generational changes in microprocessors, so it was OK at first blush. Unfortunately, recompilation alone would improve performance for only a relatively small set of applications. To benefit more broadly, applications would need to be modified with threading or some other form of parallel control to run on the multiple cores simultaneously. In some cases, the applications didn’t have the right structure to benefit from parallel execution, and the net result was that even with some changes to their code bases, applications were not going to see exponential performance improvements from Moore’s Law in the future. The industry was about to collectively sing “Oops, I Did It Again” as it ran headlong into the brick wall of performance described by Amdahl’s Law.
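To give a feel for what “some other form of parallel control” means in practice, here is a minimal sketch in Python; the simulate_chunk kernel is a hypothetical stand-in, and the point is the pattern of explicit decomposition that developers now had to add themselves rather than getting speed for free.

```python
# A minimal sketch of explicit data parallelism for a multi-core processor.
# The simulate_chunk kernel is a hypothetical stand-in; the pattern is the point:
# the developer must partition the work, not just recompile and wait.
from multiprocessing import Pool

def simulate_chunk(chunk):
    """Stand-in for a compute-heavy kernel applied to one slice of the data."""
    return sum(x * x for x in chunk)

def run_parallel(data, workers=4):
    # Split the data into roughly equal chunks, one per worker core.
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with Pool(processes=workers) as pool:
        return sum(pool.map(simulate_chunk, chunks))

if __name__ == "__main__":
    print(run_parallel(list(range(1_000_000))))
```

Whether such a decomposition actually scales across the cores depends on how much of the work can be partitioned this way, which is exactly where Amdahl’s Law comes in.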

Like Maxwell’s relations, Amdahl’s Law is grounded in mathematics, and it won’t expire anytime soon. In simple terms, it says that if you try to speed up an application with special processing, the overall speedup is limited by the fraction of time the application can spend executing in the special processor; the part you cannot accelerate sets the ceiling. For example, let’s say you were playing an Internet game and half the time (say 50 milliseconds out of every 100) the application was accessing the Internet, while the other half it was executing in a special gaming processor that was 10 times faster than the general-purpose processor. The time you’d see would be 55 milliseconds instead of 100. The theory says that even if the special processor were infinitely fast, the speedup would be, at best, 2x.
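As a quick check on those numbers, here is a small Python sketch, using the hypothetical timings from the example above, that applies Amdahl’s Law directly:

```python
# Amdahl's Law: overall speedup when only a fraction of the work is accelerated.
def amdahl_speedup(accelerated_fraction, factor):
    """Return the overall speedup when `accelerated_fraction` of the runtime
    runs `factor` times faster and the rest is unchanged."""
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)

total_ms = 100.0          # 50 ms network + 50 ms compute, as in the example
f = 0.5                   # half the runtime can use the special processor
print(total_ms / amdahl_speedup(f, 10))     # ~55 ms with a 10x faster processor
print(amdahl_speedup(f, float("inf")))      # 2.0: the best case is a 2x speedup
```

The second line makes the ceiling explicit: no matter how fast the special processor gets, the unaccelerated half caps the gain at 2x.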

Well, it’s time to get out of the hunk of stainless steel that defined the look of the maligned DeLorean — whose parent company had its start in the same year the Cray-1 shipped its first system — with our feet firmly planted in the present to observe that the more things change, the more they stay the same. The leading purveyors of processors for PCs and servers have committed to multi-core processors. Custom processors in multiple forms, from IBM’s Cell to Graphics Processing Units to Application Specific Integrated Circuits and Field Programmable Gate Arrays, are all being announced and improved at a feverish clip. AMD just announced Torrenza, which will allow custom processors to be plugged directly into the HyperTransport interconnect. We can expect Intel to make a similar move soon to allow direct connection of custom CPUs in its platforms.

All of these innovations are workarounds to deliver continued improvements in computing power. They will each benefit from Moore’s Law for the next few years and see improvements in raw speed. They, and the user community, also will be subject to Amdahl’s Law, which will drive software developers to try to achieve a reasonable fraction of the potential raw performance. All of this effort will be an attempt to keep performance improving year over year in the face of Maxwell’s relations.

What does all this mean for the grid community? I think the focus of provisioning computing assets for a computing grid will need to be extended from aggregating and utilizing multiple general-purpose processors to finding the computing facility on the network with the processing appropriate for a given workload. Workload mapping is already included in the provisioning model, and some users, such as Steve Yatko from Credit Suisse First Boston, have been vocal about the need to pursue this aspect of grid for some time. With workload mapping, grid-provisioning software determines what kind of computing is required by the application, based on data types and other context provided by the developer and/or analyzed automatically by the provisioning software, and then maps the work to the right hardware on the network.
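As a rough illustration of that idea, here is a minimal sketch in Python; the resource classes, job attributes and matching rule are all hypothetical, not any particular grid scheduler’s API.

```python
# Hypothetical workload mapping: match a job's declared characteristics
# to the best-suited class of resource available on the grid.
RESOURCES = {
    "vector_node":   {"good_for": {"dense_linear_algebra"}, "cores": 4},
    "gpu_node":      {"good_for": {"dense_linear_algebra", "stream"}, "cores": 2},
    "commodity_x86": {"good_for": {"general", "integer"}, "cores": 8},
}

def map_workload(job):
    """Pick the first resource class whose strengths cover the job's kernel type;
    fall back to commodity processors if nothing specialized matches."""
    for name, spec in RESOURCES.items():
        if job["kernel"] in spec["good_for"]:
            return name
    return "commodity_x86"

job = {"name": "risk_sim", "kernel": "dense_linear_algebra", "threads": 16}
print(map_workload(job))   # -> "vector_node" under these made-up rules
```

A real provisioning layer would weigh availability, data locality and cost as well, but the core step is the same: translate what the workload needs into which box on the network should run it.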

At the most basic level, the need to write applications in a modular way (service-oriented architecture) that allows each module to share multiple computing assets and run on the most efficient computing, network and storage devices (service-oriented infrastructure) will shift from nice-to-have to mandatory. The grid will become more important than ever as the free lunch of serial performance gains gradually slows and eventually comes to an end.

About Tom Gibbs

Tom Gibbs is managing partner at Vx Ventures, a global consulting and investment partnership that focuses on the application of new IT architectures, such as grid computing, service-oriented architecture, RFID and sensor networks, to help communities and companies accelerate economic growth and improve the social well-being of their employees and citizens. Prior to Vx Ventures, Tom was the director of worldwide strategy and planning in the solutions market development group at the Intel Corporation, where he was responsible for developing global industry marketing strategies and building cooperative market development and marketing campaigns with Intel’s partners worldwide. He is a graduate in electrical engineering from California Polytechnic University in San Luis Obispo and was a member of the graduate fellowship program at Hughes Aircraft Company, where his areas of study included non-linear control systems, artificial intelligence and stochastic processes. He also previously served on the President’s Information Technology Advisory Council for open source computing.
