AMD: The Integration Revolution?

By Nebojsa Novakovic

May 7, 2012

The potential and challenges of multi-core processing

In recent years, microprocessor designers hit the limits of the single-core architecture and shifted to power-efficient multi-core designs. Now they are running up against the limits of that approach too, as programming multi-core processors becomes increasingly complex. One path forward is to combine the strengths of the CPU and the GPU. But integrating two high-powered processing units with differing performance and bandwidth requirements poses real challenges for the overall system architecture.

AMD is proposing to solve these issues with a combination of hardware solutions and system-level programming tools, ushering in a new era of what has become known as heterogeneous computing.

AMD Fig. 1 

The promise of Heterogeneous System Architecture (HSA)

While it has been recognized for some time that GPUs can be used for parallel processing, the programmer’s task has been difficult, if not extraordinarily so. That’s where AMD’s Heterogeneous System Architecture (HSA) comes in. HSA is a full-solution approach that lets mainstream programmers write parallel processing code as easily for the GPU as for the CPU. In some cases, the same code may be able to execute on either the CPU or the GPU, depending on the system’s resources.

One way HSA can help solve the problem is by providing a unified address space for the CPU and GPU. With HSA, GPUs support the same page tables that x86 CPUs use for mapping program memory pages to physical memory. GPUs can now use a much larger memory map and, more importantly, a pointer is usually the same for code running on the CPU and code running on the GPU. As a result, a single copy of the data can exist in memory, and both the CPU and GPU can act upon it. The programmer doesn’t have to manage two or more copies of the same data. This design also helps improve performance, because it is no longer necessary to make copies and keep them synchronized.
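To make the difference concrete, here is a minimal sketch in Python. It is only an analogy: ordinary Python buffers stand in for CPU and GPU memory, and the function names (`run_with_copies`, `run_shared`) are made up for illustration and are not part of HSA or any AMD API. The first function mimics the traditional copy-in, compute, copy-out pattern; the second lets the "GPU" step work through a view of the same buffer, much as a shared pointer would.

```python
from typing import Tuple

# Analogy only: Python buffers stand in for CPU and GPU memory.
# Nothing here is an HSA or AMD API; all names are hypothetical.


def run_with_copies(host_data: bytearray) -> Tuple[bytearray, int]:
    """Separate-memory model: copy in, compute, copy back."""
    bytes_copied = 0

    device_data = bytearray(host_data)        # "host to device" copy
    bytes_copied += len(device_data)

    for i in range(len(device_data)):         # stand-in for a GPU kernel
        device_data[i] = (device_data[i] + 1) % 256

    host_data[:] = device_data                # "device to host" copy
    bytes_copied += len(device_data)
    return host_data, bytes_copied


def run_shared(host_data: bytearray) -> Tuple[bytearray, int]:
    """Unified-address model: the kernel works on the one and only copy."""
    with memoryview(host_data) as shared:     # a view of the same memory, no copy
        for i in range(len(shared)):
            shared[i] = (shared[i] + 1) % 256
    return host_data, 0


if __name__ == "__main__":
    data = bytearray([1, 2, 255])
    result, copied = run_with_copies(bytearray(data))
    print("with copies:", list(result), "bytes copied:", copied)
    result, copied = run_shared(bytearray(data))
    print("shared:     ", list(result), "bytes copied:", copied)
```

Both functions produce the same result; the shared version simply never duplicates the data, so there is nothing to keep in sync.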

AMD is proposing an open platform architecture for HSA with published specifications. HSA will have a virtual ISA known as HSAIL (HSA Intermediate Language), a memory model, and a system specification. AMD is working with hardware, operating system, tools, and application companies to form an HSA foundation to guide the architectural development into the future.

Among the programmers I talk to, there is a great deal of excitement about being able to tap that huge unused performance potential of GPUs via HSA in an easy and transparent way, something that was difficult, if not impossible, until now.

Combining the CPU and GPU: Bringing the APU to life

A hardware merger of the CPU and GPU, AMD’s Accelerated Processing Units (APUs) provide streamlined hardware-level integration of these two processing units. By the time of the AMD Fusion12 Developer Summit (June 11-14, 2012), AMD will have brought “Trinity,” a second-generation APU, to market alongside its two first-generation products: the AMD A-Series APUs, formerly codenamed “Llano,” and the AMD C- and E-Series APUs, formerly codenamed “Brazos.”

The “Brazos”-based AMD C- and E-Series APU combines an ultra-low-power dual-core CPU with an entry-level AMD Radeon™ GPU. Variants of these APUs are targeted at tablets, fanless notebooks, entry-level notebooks, and entry-level desktops. The “Llano”-based AMD A-Series APU combines two or four “Husky” CPU cores with a mid-range, discrete-class AMD Radeon™ HD 6500-series GPU. Husky cores are the next generation of the cores used in the popular AMD Phenom™ processor series. The “Trinity”-based AMD A-Series variants target mainstream notebooks and mainstream desktops with good CPU performance and industry-leading integrated graphics and video capabilities.

All three AMD APU families benefit from greatly increased communication speed between the CPU and GPU: both bandwidth and response latency are improved. All support DDR3 memory and DirectX 11 graphics, and all have dedicated hardware for video playback. The AMD C- and E-Series have a single memory channel, while the AMD A-Series, both Llano and Trinity, have dual-channel memory support. To meet the memory bandwidth requirements of the larger GPUs in the A-Series, Llano and Trinity support memory speeds of up to DDR3-1600 and DDR3-1866, respectively.
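As a rough sanity check on those memory speeds, peak theoretical bandwidth can be estimated as transfer rate times bus width times channel count. The sketch below assumes the standard 64-bit DDR3 channel width; the function name is hypothetical, and sustained real-world bandwidth will be lower than these peak figures.

```python
def peak_bandwidth_gbs(mega_transfers_per_s: float,
                       channels: int,
                       bus_width_bits: int = 64) -> float:
    """Peak theoretical memory bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes one transfer moves bus_width_bits per channel; 64 bits is the
    standard DDR3 channel width. This is an upper bound, not a measurement.
    """
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * 1e6 * bytes_per_transfer * channels / 1e9


if __name__ == "__main__":
    # Dual-channel configurations at the two speeds named in the text.
    for label, speed in (("DDR3-1600", 1600), ("DDR3-1866", 1866)):
        print(f"{label}, dual channel: {peak_bandwidth_gbs(speed, 2):.1f} GB/s")
```

Under these assumptions, dual-channel DDR3-1600 peaks at about 25.6 GB/s and dual-channel DDR3-1866 at just under 30 GB/s, all of which the CPU and GPU have to share.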

The APU impact on system design: board, memory, graphics, and form factor

For a mainstream PC solution, the APU enables you to fit a quad-core DirectX 11 3D gaming system with all the familiar features into a form factor smaller than Mini-ITX. Even if you add PCIe® expansion slots, there is still plenty of space for a Mini-ITX format solution; even a Pico-ITX form factor should be possible.

This opens up a design choice. One option is to stick with the default spec; the result will be smaller than most current TV set-top boxes, with near-zero noise. The other is to stay with the Mini-ITX format but boost performance with a better cooling solution (to enable overclocking), faster memory options, and AMD Radeon™ Dual Graphics.

Numerous online reviews of the first-generation APU desktop platforms have discussed the various memory choices. Based on current product performance and the return on the memory speed investment, DDR3-1866 CL9 DIMMs are the best memory choice, providing outstanding performance per dollar. For the second-generation APU, DDR3-2133 CL10 or better memory should strike a good balance.

In summary, the AMD APU’s integration, balanced performance, expandability, and power usage enable new ultra-compact form factors for complete systems. The power savings that can come from delivering nearly a teraflop of performance at less than 100W can enable much “greener” high-end machines, up to the supercomputer range.
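To put the teraflop-per-100W claim in perspective, here is a back-of-the-envelope calculation. It treats the figures above as round numbers (1 teraflop peak at 100W) and scales them naively to a petaflop; the function names are hypothetical, and the estimate ignores memory, interconnect, cooling, and the gap between peak and sustained performance.

```python
def gflops_per_watt(peak_gflops: float, watts: float) -> float:
    """Peak energy efficiency in GFLOPS per watt."""
    return peak_gflops / watts


def processor_power_kw(target_pflops: float, efficiency_gflops_per_w: float) -> float:
    """Processor power (kW) needed to reach a peak target at a given efficiency.

    Naive scaling: counts processor power only, nothing else in the system.
    """
    target_gflops = target_pflops * 1e6       # 1 PFLOPS = 1,000,000 GFLOPS
    return target_gflops / efficiency_gflops_per_w / 1e3


if __name__ == "__main__":
    # Round-number assumption: 1 TFLOPS (1000 GFLOPS) peak at 100 W.
    eff = gflops_per_watt(1000.0, 100.0)
    print(f"efficiency: {eff:.0f} GFLOPS/W")
    print(f"1 PFLOPS peak: about {processor_power_kw(1.0, eff):.0f} kW of processor power")
```

On those assumptions the efficiency works out to roughly 10 GFLOPS per watt, or on the order of 100 kW of processor power per peak petaflop.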

The APU impact in tablet and netbook space

The lowest-power version of the AMD E-Series APU is suitable for an x86 tablet, which can run Windows 8 and avoid dependence on application stores and similar centralized resources. The added CPU and graphics horsepower of the APU enables new, productive form factors for tablets. How about an 11- or 12-inch full HD+ tablet with a 16:10, 1920×1200 or even a 3:2, 1920×1280 screen? Not only are these far more productive than movie-screen 16:9 displays, but in portrait mode they can emulate a printed page. Future ultrathin versions of quad-core AMD APUs on newer processes are designed to enable ultrahigh-resolution 3D tablets that can also substitute for a proper PC.

Taking the APU further

A powerful GPU closely tied to the CPU benefits not only 3D graphics applications, but also applications with intense parallel computation. Examples include ultrafast large-spreadsheet calculation, database manipulation, and media creation. On a larger scale, an APU-powered petaflop machine could enable more affordable supercomputing and large-scale data analysis for many more users. An APU-based supercomputer could achieve its floating-point performance rating at one-third the power of the usual purely x86 CPU-based one, a massive advantage when combined with the right software.

Looking forward

The AMD APU and HSA approaches are revolutionary for programmers and users alike. AMD’s architecture changes not just the PC processor architecture, but the system design as well. That may not be obvious when looking at the first-generation APUs. However, as AMD further develops APUs, the benefit is bound to become obvious.

The HSA programming model will open new worlds of opportunity for programmers, challenging them to harness the new performance potential. The more intricate interdependencies and benefits of that integration will also require system builders and designers to put more thought into maximizing their systems’ competitiveness. Over time, the benefits of the APU should spread from top to bottom, from the supercomputer to the smartphone.

About the author: Based in Singapore, Nebojsa Novakovic is a strategic advisor to VR-Zone.com, Asia Pacific editor for TheInquirer.net, and frequently writes on high-end computing, system architecture, processors, 3-D graphics and related subjects.

Ready to learn more? AMD’s Fusion12 Developer Summit unites the industry’s experts in the world of heterogeneous computing. Held June 11-14, 2012, in Bellevue, Washington, the event provides deep, actionable content across ten tracks, covering heterogeneous computing as it relates to multimedia, graphics, cloud computing, security, big data, and more. Whether you’re responsible for planning or development, you’ll find the tools, knowledge, and resources you need to take advantage of this new era of computing. Learn more at amd.com/afds.

 

 

This paper is sponsored by AMD
© 2012 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD Arrow logo, and combinations thereof are trademarks of Advanced Micro Devices, Inc. Other names are for identification purposes only and may be trademarks of their respective owners.
