SGI Expanding the Reach of Linux

By Nicole Hemsoth

October 6, 2006

Steve Neuner, the director for Linux engineering at SGI, has been pushing Linux up the scalability ladder for the better part of the 21st century. In August of this year, SGI announced that it was able to run a single system image of the Linux OS over 1024 processors on an Itanium-based Altix 4700 supercomputer. How was this feat accomplished? This week at the Gelato Itanium Conference and Expo (ICE) in Singapore, Neuner presented a session that described the Linux kernel modifications that helped to make this possible. HPCwire caught up with him before the conference to ask him about the Linux improvements and where the future of single system image scalability is headed.

HPCwire: Can you give us a brief timeline of how Linux has scaled from 8 processors to 1024 processors over the last five years?

Neuner: In the summer of 2001, we built an early 32-processor prototype system in the lab. SGI used it extensively to begin identifying and fixing scaling issues. This development system was later increased to 64 processors, which became our initial configuration limit for a single system image of the Linux kernel when we launched SGI Altix in February of 2003. A year later, that limit was increased to 256 processors.

Then, in February of 2005, we started shipping the 2.6 Linux kernel, which was a major step forward that enabled support for 512-processor systems. In August of this year, this limit was increased to our current limit of 1024 processors.

HPCwire: Can you describe the types of changes that were made to the Linux 2.6 kernel to get a single image of the OS to run on a 1024-processor system?

Neuner: The changes usually fall into one of two categories. The first is getting the system to boot and recognize all the hardware. This typically involves increasing the size of data structures throughout the kernel that contain information related to the number of nodes, processors, or memory on a NUMA system. SGI uses a hardware simulator to find and fix most of these problems before we have a system of that size in the lab. For example, when engineering received the first 1024-processor system for testing, it booted right up the very first time.
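To make the first category concrete, here is a minimal Python sketch of a fixed-capacity processor bitmap. It is an illustration only, not kernel code: the `CpuMask` class, the `discover` helper, and the `max_cpus` parameter are hypothetical names invented for this example, standing in for a kernel structure whose size is fixed when the kernel is built. The sketch shows why a structure sized for 512 processors cannot describe a 1024-processor machine until its limit is raised.

```python
class CpuMask:
    """Fixed-capacity set of CPU ids (hypothetical stand-in for a kernel bitmap)."""

    def __init__(self, max_cpus):
        self.max_cpus = max_cpus  # assumed build-time limit on CPU ids
        self.bits = 0             # bit i set means CPU i was recognized

    def add(self, cpu):
        # A CPU id at or beyond the limit has no slot in the structure.
        if not 0 <= cpu < self.max_cpus:
            raise ValueError(f"CPU {cpu} exceeds limit of {self.max_cpus}")
        self.bits |= 1 << cpu

    def count(self):
        # Number of CPUs currently recorded in the mask.
        return bin(self.bits).count("1")


def discover(mask, cpus_found):
    """Register each discovered CPU; return the ids the mask could not hold."""
    rejected = []
    for cpu in range(cpus_found):
        try:
            mask.add(cpu)
        except ValueError:
            rejected.append(cpu)
    return rejected


small = CpuMask(max_cpus=512)
large = CpuMask(max_cpus=1024)
missed_small = discover(small, 1024)
missed_large = discover(large, 1024)
print(small.count(), len(missed_small))  # 512 512
print(large.count(), len(missed_large))  # 1024 0
```

With the smaller limit, half of the processors are never recorded, which is the kind of problem a simulator run can expose before real hardware of that size exists.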

Once Linux can boot and run on a larger system, the next category of fixes is getting Linux to perform well. This work often involves running benchmark tests and various HPC applications, so that hot locks, cache line contention, timing windows, and race conditions can be exposed and pinpointed in order to improve Linux's efficiency on very large systems.

Surprisingly, most of the changes going from 512 processors to 1024 processors fell into the first category of enabling the kernel to recognize and boot on a 1024-processor system. It turned out that the performance scaling work done earlier with our 512p system paid off, since the issues had already been found and fixed. So going from 512p to 1024p became more of a testing and validation exercise. As a result, we were able to officially support 1024 processors for our customers a year ahead of plan.

HPCwire: Can you talk about some of the other 2.6 Linux kernel enhancements that have been added for HPC functionality?

Neuner: As processor counts increase, so does memory. Significant improvements in 2.6 were made in memory handling and supporting larger memory sizes. Some examples in this area include support for over 10 TB of memory, improved node locality and NUMA awareness in various kernel memory allocation mechanisms, 4-level page tables, page migration, out-of-memory error handling improvements, and fault containment of double-bit uncorrectable memory errors.

Process scheduling is another area that has seen significant advances. Some examples include the O(1) scheduler, which maintains an almost constant level of system overhead regardless of the system size; CPU affinity support for placement of processes on specific processors; CPUSETS, which allow a user to set aside specific processors and local memory for exclusive use; and dynamic scheduling domains.
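The constant-overhead property of the O(1) scheduler can be illustrated with a short Python sketch. This is a simplified model, not the kernel's implementation: the `RunQueue` class and the task names are hypothetical, and the sketch leaves out the active and expired arrays, timeslices, and per-processor run queues of the real scheduler. It keeps one FIFO queue per priority level plus a bitmap of non-empty levels, so choosing the next task costs a bit operation over a fixed number of priorities rather than a scan over every runnable task.

```python
from collections import deque

NUM_PRIORITIES = 140  # the 2.6 O(1) scheduler used 140 priority levels


class RunQueue:
    """One FIFO queue per priority plus a bitmap of non-empty priorities."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]
        self.bitmap = 0  # bit p set means queues[p] holds at least one task

    def enqueue(self, task, priority):
        # Append to the queue for this priority and mark it non-empty.
        self.queues[priority].append(task)
        self.bitmap |= 1 << priority

    def pick_next(self):
        """Return the next task from the highest-priority non-empty queue."""
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit; a lower number means a higher priority.
        priority = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[priority]
        task = queue.popleft()
        if not queue:
            # Clear the bit once the queue for this priority is empty.
            self.bitmap &= ~(1 << priority)
        return task


rq = RunQueue()
rq.enqueue("batch-job", 120)
rq.enqueue("interactive-shell", 100)
rq.enqueue("second-batch-job", 120)
order = [rq.pick_next() for _ in range(4)]
print(order)  # ['interactive-shell', 'batch-job', 'second-batch-job', None]
```

The work done in `pick_next` is bounded by the fixed number of priority levels, so it does not grow as more tasks are queued.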

Other areas of improvement include the incorporation of XFS for high bandwidth and large file systems, support for a large number of disks, an overhaul of the block and driver layer to enable large and parallel I/Os, high performance networking with 10 Gigabit Ethernet and InfiniBand, improved timer resolution, and the new thread library.

All these improvements, along with 2.6's performance and scaling improvements, enable Linux to continue to expand into other areas of deployment. For example, the same general-purpose Linux kernel used from small-to-large or enterprise-to-HPC servers can now also be deployed and used in real-time applications, providing support and capabilities previously found only on proprietary or specialized real-time operating systems.

HPCwire: What elements of the Linux HPC work are done by SGI versus others in the community?

Neuner: While SGI often focuses on HPC and I/O related kernel issues, it's not unusual for us to encounter a problem that's already being worked on or addressed by someone in the community, since many performance, error handling and robustness improvements needed for HPC environments also benefit or affect enterprise environments.

However, our access to and use of very large systems also means we are often first to find various HPC, scaling, or performance-related problems. This is because one of the best ways to shake out and find problems faster is to “turn up the stress knobs” on a system by using very large system configurations for testing. Systems with large numbers of processors and large amounts of memory and I/O are therefore crucial, and we rely on them heavily for all our kernel development and testing.

Also, as community acceptance is critical to all kernel work SGI does, virtually all of the work we do involves collaboration with some subset of the Linux community.

HPCwire: Do you think the open source nature of Linux has sped up development of HPC OS features or made it a more complex undertaking?

Neuner: At SGI, OS engineers continue to work on kernel issues and improvements on Linux as we did on IRIX. The main difference now is how we deliver these improvements to our customers. Seeking acceptance and agreement on a proposed change from others within the Linux community seemed like an extra hurdle at first, but over time it became clear that this collaboration combined with the high quality standards is why Linux has become highly versatile, robust, and stable for all workload environments including HPC. The Linux community software development model enables our customers to benefit from improvements made by the entire Linux community rather than just improvements made by SGI engineers.

HPCwire: What are the practical limits for single system image scalability? Are they inherent in the kernel design or just the result of hardware limitations?

Neuner: The hardware, OS, and HPC application all need to scale in order for users to see the performance gains from adding more processors to their system. With HPC applications, scaling can occur in two ways. The first is with the already numerous existing “embarrassingly parallel” applications that are ready to exploit large CPU counts using the hardware as a “capability server.” The second way is when a system is used as a “capacity server,” where multiple applications each use only a subset of the total available processors. Either way, many HPC applications and environments can usually take advantage of a larger system when more processors are added.

For hardware, SGI systems are designed with hardware scalability and performance as paramount. The operating system scalability typically lags behind, especially since one really needs to get access to the hardware first in order to go after and solve the OS issues. The hardware limit for our current generation of Altix is 4096 processors for running a single system image of the operating system.

With the operating system, the practical limit is hit when a highly specialized, light-weight, and dedicated operating system customized for a specific hardware architecture must be used over a general purpose one. Today, SGI uses the same general purpose Linux kernel whether running with 2 or 1024 processors — which is incredible and a testament to the excellent design and work by everyone within the Linux community.

We've already successfully booted Linux in the lab on 1742 processors, at which point we encountered more internal kernel issues that will need to be addressed. So it's an ongoing process, and given Linux's impressive track record, it's impossible to predict the upper limit.

-----

Steve Neuner is the Linux Engineering Director at SGI and has been working on Linux and Itanium-based systems since joining SGI 7 years ago. Prior to SGI, Steve worked at Digital Equipment Corporation, Sequent Computer Systems, and MAI Basic Four. He has been involved with Linux and UNIX kernel development for over 20 years.
