Ubiquitous Parallelism and the Classroom

By Tom Murphy of Contra Costa College, Paul Gray of the University of Northern Iowa, Charlie Peck of Earlham College, and Dave Joiner of Kean University

November 20, 2009

The oft-contended best simple statement is that we need ubiquitous parallelism in the classroom. Once upon a time, it was solely the lunatic fringe, programming esoteric architectures squirreled away in very special corners of the globe, that cared about parallelism. In the near future, most electronic devices will have multiple cores, which would benefit greatly from parallel programming. The low-hanging fruit is, of course, the student’s laptop, and helping the student make full use of that laptop.

So how do we get there?

Our perception of next steps comes from close to a decade of collaboration pushing parallel and distributed computing education. This doesn’t mean we are right, just that we have been walking the walk. Three of the four of us are computer scientists and Dave, our physicist, is essentially also one (of course he claims that we’re all physicists). The bulk of our time together, outside of our respective day jobs teaching, is spent leading week-long workshops for faculty – largely focused on the teaching of parallel and distributed programming and computational thinking. Our assertion is this: As computer architectures evolve from single core to multicore to manycore, the computer science curriculum must experience a commensurate single-course to multi-course to many-course evolution in terms of where parallelism is studied.

Thus, you’re probably not surprised we’re saying faculty education is the key way to get from here to there, using as many modes of conveyance as possible. For teaching parallelism in our courses, few of us CS educators have learned what we have needed from our own formal education. We possess a self-taught science/art crafted via the hands-on hard-knock cycles of design, debugging, and despair which provided us with rich learning opportunities. This highlights the goals we have for our students: theory tightly coupled with the pragmatic skills of the practiced practitioner, learned via the cycles of design, debugging, and despair. Note that performance programming is wonderfully resurfacing in importance, for if you don’t need performance, why bother with the complexity of a parallel solution? Just run on your friendly neighborhood SMP or NUMA architecture, which will suffice as a first order solution for many problems. It was performance parallel programming that put the ‘L’ in lunatic fringe, and to raise ‘L’, we will ultimately need to examine the isolated graduate and undergraduate courses and weave the key components of parallelism into the fabric of all computer science courses beginning at the earliest level.

So let’s get specific on possibilities for the first courses at the undergraduate level. The core of CS1 typically starts with the nomenclature, theory, and components of a simple algorithm and a basic block of execution. Flow of control is our next extension: branches, loops, and functions. Parallelism is a natural next layer. When we invoke parallelism, we might demonstrate by conjuring with threads and shared memory, since the use of shared memory will not perturb the student’s simple notion of array-like memory. Additionally, the most frequently used shared memory mechanism, OpenMP, allows a gradual move from pure von Neumann towards “pure” shared memory parallelism. This will cover fine-grained parallelism. A hunger for a different course of studies leads to the coarse-grained approach of distributed memory parallelism with MPI. Larger scale parallelism is naturally, and necessarily, discovered by students as the problems of interest continue to grow.
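As a minimal sketch of that "next layer" after loops and functions, the example below splits a familiar summation loop across threads that all read one shared list. It is written in Python with the standard `concurrent.futures` module as a stand-in for the OpenMP style described above, not as OpenMP itself. The names `chunk_bounds` and `parallel_sum` are hypothetical, chosen for illustration. In standard CPython builds the global interpreter lock means this CPU-bound loop will not actually run faster, so the sketch shows the structure of the decomposition rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_bounds(n, workers):
    """Split range(n) into `workers` contiguous [start, stop) slices."""
    step, extra = divmod(n, workers)
    bounds, start = [], 0
    for w in range(workers):
        # The first `extra` slices take one additional element each.
        stop = start + step + (1 if w < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def parallel_sum(data, workers=4):
    """Sum a shared list by handing each thread one slice of the index range."""

    def partial(bounds):
        start, stop = bounds
        total = 0
        for i in range(start, stop):  # the same loop a CS1 student already writes
            total += data[i]          # every thread reads the one shared list
        return total

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Combine the per-thread partial sums (a reduction).
        return sum(pool.map(partial, chunk_bounds(len(data), workers)))


if __name__ == "__main__":
    data = list(range(1, 1001))
    print(parallel_sum(data, workers=4))
    print(parallel_sum(data, workers=1))  # one worker is the sequential special case
```

Calling `parallel_sum(data, workers=1)` runs the same code with a single slice, so the sequential program is simply the one-worker case of the parallel one.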

The legal battlefield of Amdahl and Gustafson is a good next stop, guiding us into the study of data structures and algorithms via a perilous path littered with algorithms which scale poorly. Unchecked and unplanned parallelism will lead us to throttled resources, whether the von Neumann bottleneck or the more insidious communication costs incurred when trying to tame a parallel algorithm. Students can learn of dwarvish parallel patterns and associated phenomena such as a sequentially elegant quicksort quickly foundering in the presence of unamortized distributed memory costs.
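For readers who want the two laws side by side, here is a small sketch of their standard textbook forms. The function names are hypothetical. Note that the fraction `p` is measured differently in each law: for Amdahl it is the parallelizable share of the sequential run time on a fixed problem, while for Gustafson it is the share of the parallel run time spent in parallel work on a problem that grows with the number of processors.

```python
def amdahl_speedup(p, n):
    """Amdahl's law: fixed problem size.

    p is the fraction of the sequential run time that can be parallelized,
    n is the number of processors.
    """
    return 1.0 / ((1.0 - p) + p / n)


def gustafson_speedup(p, n):
    """Gustafson's law: problem size scales with the processor count.

    p is the fraction of the parallel run time spent in parallel work,
    n is the number of processors.
    """
    return (1.0 - p) + p * n


if __name__ == "__main__":
    p = 0.9  # illustrative value, not taken from the article
    for n in (1, 2, 4, 8, 16, 1024):
        print(n, round(amdahl_speedup(p, n), 2), round(gustafson_speedup(p, n), 2))
```

With `p = 0.9`, the Amdahl speedup can never exceed `1 / (1 - p) = 10` no matter how many processors are added, while the Gustafson speedup keeps growing linearly with `n`. That contrast is a compact way to show students why a fixed-size problem and a scaled problem lead to such different conclusions.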

This is a good time to consider how to squeeze weeks and weeks of new material on parallelism into a semester. Something has to give and something will give, but this is not a new dilemma. It is something we each faced when first crafting what we will cover in a course. It is something we face to a greater or lesser extent every time we re-teach a course given the pace of change in our discipline.

Now it is time for an anecdote. Tom interviewed Dave Patterson as part of the “Teach Parallel” series of interviews. The interview ranged over many topics, one of which was the fourth edition of Dave’s “Computer Organization and Design”, which gloriously has parallel topics woven into each chapter. This led to talking with Dave’s publisher about targeting an adaptation of the book towards community colleges, such as Contra Costa College where Tom teaches. The publisher was surprised to learn no dilution of the 703 pages was desired. Tom plans to cherry-pick the material to use in his Computer Architecture course, continuing an experiment he’s been running in all his courses, in which the entire book is covered, just at varying depths. It is important for Tom to convey how to be a good student, part of which is being able to self-learn from practitioners’ resources. This raises a good point: more textbook support for parallelism is going to make this whole process a heck of a lot easier. Unfortunately, it takes a while to prime the curricular pump.

Computer architecture has traditionally incorporated elements of parallelism and concurrency: semaphores and atomic operations, pipelines and multiple functional units, SMP architectures, and instruction and data paths. It has always been the place where the key hardware issues of current architectures inform the software designed to run on them.

There are no easy answers, but there really are clear steps. We need to help students get to a place where they think of a single processing unit as just a special case of multiple processing units, much like they now learn to view a single variable as a special case of an array.

About the Authors

Thomas Murphy is a professor of Computer Science at Contra Costa College (CCC). He is chair of the CCC Computer Science program and is director of the CCC High Performance Computing Center, which has supported both the Linux cluster administration program and the computational science education program. Thomas has worked with the National Computational Science Institute (NCSI) since 2002. He is one of four members of the NCSI Parallel and Distributed Working group, which presents several three to seven day workshops each year, and helps develop the Bootable Cluster CD software platform, the LittleFe hardware platform, and the CSERD (Computational Science Education Reference Desk) curricular platform.

Paul Gray is an Associate Professor of Computer Science at the University of Northern Iowa. He created the Bootable Cluster CD project (http://bccd.net/) and provides instructional support for the National Computational Science Institute summer workshops on Cluster and Parallel Computing. He was SC08 Education Program Chair and serves on the executive committee for the SC07-11 Education Program.

Charlie Peck is the leader of the Cluster Computing Group (CCG) at Earlham College, a student/faculty research group in the Computer Science department. The CCG is the primary design and engineering team for LittleFe, developers of computational science software, e.g., [email protected], and technical contributors to Paul Gray’s Bootable Cluster CD project. Additionally, Charlie is the primary developer on the LittleFe project.

Dave Joiner is an assistant professor of Computational Mathematics in the New Jersey Center for Science, Technology, and Mathematics Education. The NJCSTME focuses on the training of science and math teachers with an integrated view of modern math, science, and computing. Additionally, Dave has collaborated since 1999 with the efforts of the Shodor Education Foundation, Inc., and the National Computational Science Institute. He currently serves as a Co-PI on the Computational Science Education Reference Desk, the Pathway of the National Science Digital Library devoted to computational science education.
