Blue Waters Project Announces Introduction to HPC Graduate Course

March 23, 2016

March 23 - The Blue Waters project at the University of Illinois is pleased to announce a graduate course, Introduction to High Performance Computing, that will be offered as a collaborative, online course for multiple participating institutions. We are seeking other university partners that are interested in offering the course for credit to their students. The course includes online video lectures, quizzes, and homework assignments with access to free accounts on the Blue Waters system.

Participating institutions will need to provide a local instructor who will be responsible for advising the local students and officially assigning grades. Students will complete the online course exams and exercises as part of their grade.

The instructor for the course is Dr. David E. Keyes, Director of the Extreme Computing Research Center and Founding Dean of the Mathematical and Computer Sciences and Engineering Division at the King Abdullah University of Science and Technology (KAUST).

Prerequisites for the graduate students include:

  • Experience working in a Unix environment
  • Experience developing and running scientific codes written in C or C++
  • Familiarity with basic numerical algorithms and basic computer architecture

The expectations for students, faculty, and the instruction team are noted below. Interested faculty should contact Steve Gordon, organizer of the Blue Waters course program at [email protected] or by phone at 614-292-4132.

Expectations for Participants

The expectations of the “collaborating faculty” are that they will:

  • Establish a “collaborating course” (possibly a special topics course) on the autumn course catalog
  • Promote this course to students on their own campus
  • View the recorded lectures together with their local enrolled students
  • Provide office hours to advise the students on the course content
  • Proctor the course exam
  • Provide regular feedback on behalf of the students to Dr. Keyes on the course throughout the semester

The expectations of Dr. Keyes and the O2PEP team are that they will:

  • Provide an initial live web-cast to introduce the instructor, TAs, and support staff, and to introduce remote participants and faculty to one another
  • Provide two recorded lectures per week
  • Provide exercises and activities for the students
  • Provide a web space for all course-related materials
  • Provide regular quizzes to allow the students to assess their own progress
  • Provide a mid-term exam and a final exam
  • Grade all the quizzes and exams
  • Provide TAs to assist all students with questions about the course content, exercises, quizzes, and other materials covered during the semester
  • Conduct an evaluation of the course with the participants and collaborating faculty

Expectations of the Students

  • Students must register in a “collaborating course” on their own campus
  • Students will need their own laptop or desktop system
  • Students are expected to view the recorded lectures as a group with their local “collaborating faculty” to learn/discuss the content as a group
  • Students are expected to contact the TAs at the University of Illinois for in-depth questions about the content, exercises, or other materials
  • Students will be asked to submit quizzes for self-assessment purposes
  • Students will be asked to submit a mid-term and a final exam for determining a grade, with a scale applied according to their own campus grading methods

Course Description

High performance computing algorithms and software technology, with an emphasis on using distributed memory systems for scientific computing. Theoretical and practically achievable performance for processors, memory system, and network, for large-scale scientific applications. The state-of-the-art and promise of predictive computational science and engineering. Algorithmic kernels common to linear and nonlinear algebraic systems, partial differential equations, integral equations, particle methods, optimization, and statistics. Computer architecture and the stresses put on scientific applications and their underlying mathematical algorithms by emerging architecture. State-of-the-art discretization techniques, solver libraries, and execution frameworks.

Prerequisites

Experience using C/C++ in a Unix environment, familiarity with basic numerical algorithms, and familiarity with computer architecture.

Course Flavor

A good subtitle for this course would be “Algorithms as if architecture mattered.” Architecture increasingly does matter today. During decades of progress using the paradigm of bulk synchronous processing on systems that were small enough to be considered “flat” and tightly coupled, architecture could largely be abstracted away through the Message Passing Interface (MPI), an excellent example of “separation of concerns” in computer science. One could write in a high-level language without concern about where the compiler and runtime stashed the operands, because flops were relatively slow, which made everything else, including the physical layout of the architecture, appear nearly flat. One could count flops for serial complexity estimation, and determine how many could be done concurrently (between synchronization events) for parallel complexity estimation.

Today, however, flops are cheap compared to the cost of moving data, in both time and energy expenditure. Therefore, we must worry about the topology of the network and the latencies and bandwidths of every part of the memory system and network in getting the operands to the FPUs. This gives high performance computing an emphasis different from some other types of computing. The same architecture advances that make it frustrating also make it exciting!

What new users of high performance science and engineering computing need is an introduction to the concepts, the hardware and software environments, and selected algorithms and applications of parallel scientific computing, with an emphasis on tightly coupled computations that are capable of scaling to thousands of processors and well beyond. The course material ranges (selectively) from high-level descriptions of motivating applications to low-level details of implementation, in order to expose the algorithmic kernels and the shifting balances of computation and communication between them. The homeworks range from simple theoretical studies to running and modifying demonstration codes. Modest programming assignments using MPI and PETSc culminate in an independent project leading to an in-class report.
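The shift from counting flops to counting data movement can be illustrated with a rough Python sketch. It estimates the time of one bulk-synchronous step as compute time plus memory traffic plus message cost. The machine parameters and the per-point operation counts are hypothetical round numbers chosen for illustration, not measurements of Blue Waters or any other system, and the additive model is a simplification that ignores any overlap of computation and communication.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Hypothetical per-node parameters (illustrative values only)."""
    flops_per_s: float = 1.0e11      # peak floating point rate
    mem_bytes_per_s: float = 5.0e10  # memory bandwidth
    net_latency_s: float = 1.0e-6    # latency paid per message
    net_bytes_per_s: float = 5.0e9   # network bandwidth


def superstep_time(machine, flops, mem_bytes, messages, net_bytes):
    """Additive cost model for one bulk-synchronous step (no overlap assumed)."""
    compute = flops / machine.flops_per_s
    memory = mem_bytes / machine.mem_bytes_per_s
    network = (messages * machine.net_latency_s
               + net_bytes / machine.net_bytes_per_s)
    return {
        "compute": compute,
        "memory": memory,
        "network": network,
        "total": compute + memory + network,
    }


def stencil_counts(n):
    """Assumed work for one 7-point stencil sweep on an n^3 block of doubles.

    Assumptions: 8 flops and 16 bytes of memory traffic per grid point,
    and one halo message of n^2 doubles for each of the 6 faces.
    """
    points = n ** 3
    return {
        "flops": 8 * points,
        "mem_bytes": 16 * points,
        "messages": 6,
        "net_bytes": 6 * 8 * n * n,
    }


costs = superstep_time(Machine(), **stencil_counts(100))
for name, seconds in costs.items():
    print(f"{name:8s} {seconds:.3e} s")
```

Under these assumed numbers the arithmetic is the smallest of the three terms, which is the situation the course flavor describes: the operands, not the flops, determine the running time.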

Instructors

The principal lecturer will be David Keyes, Professor of Applied Mathematics and Computational Science, KAUST. Guest lecturers will be invited to speak on their specialties. Lectures from Extreme Computing Research Center staff members highlighting open source scientific software will be incorporated into the course.

Goals and Syllabus

The overall goal is to acquaint students who anticipate doing independent work that may benefit from large-scale simulation with current hardware, software tools, practices, and trends in parallel scientific computing, and to provide an opportunity to build and execute sample parallel codes. The software employed in course examples is freely available. The course is also designed to make students intelligent consumers and critics of parallel scientific computing literature and conferences.

Much of the motivation for parallel scientific computing comes from simulations based on discretizations of partial differential equations (PDEs, typically described with sparse matrices), or integral equations (IEs, typically described with dense matrices), or based on interacting particles (unstructured interaction lists, often embedded in octrees). Of course, many applications are nonlinear, but these are typically approached as a series of linearized analyses. An understanding of the underlying equations, their physical meaning, and their mathematical analysis is important in some parts of the course and opens up many possibilities for independent projects. Other material is easily abstracted away from its underlying operator equation context to that of a generic bulk-synchronous computation that interleaves flows of data with operations on that data. The intention is to provide a course of benefit to a broad clientele of graduate researchers. In addition to computer scientists and applied mathematicians, students from mechanical engineering, electrical engineering, chemical engineering, materials science, and geophysics should find it of interest and approachable if they already have sufficient background in computing to be motivated towards the high end.
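To make the link between PDE discretizations and sparse matrices concrete, here is a minimal Python sketch. It assembles the standard second-order finite-difference matrix for the one-dimensional Poisson problem in compressed sparse row (CSR) form and applies it to a vector. The function names are hypothetical and the example is not taken from the course materials; a real application would rely on a library such as PETSc rather than hand-written loops.

```python
def poisson_1d_csr(n):
    """Return CSR arrays for the n x n tridiagonal matrix with stencil [-1, 2, -1]."""
    values, col_idx, row_ptr = [], [], [0]
    for i in range(n):
        if i > 0:
            values.append(-1.0)
            col_idx.append(i - 1)
        values.append(2.0)
        col_idx.append(i)
        if i < n - 1:
            values.append(-1.0)
            col_idx.append(i + 1)
        row_ptr.append(len(values))  # end of row i in the values array
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x, touching only the stored nonzeros of A."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


n = 1000
values, col_idx, row_ptr = poisson_1d_csr(n)
y = csr_matvec(values, col_idx, row_ptr, [1.0] * n)
print("stored nonzeros:", len(values), "of", n * n, "dense entries")
```

The stored entries grow linearly with the number of unknowns, while a dense representation grows quadratically, which is why sparse storage and sparse kernels dominate PDE-based simulation.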

Thirteen algorithmic prototypes that occur regularly in scientific computing have been identified in a famous 2006 Berkeley technical report, “The Landscape of Parallel Computing Research: A View from Berkeley” (UCB/EECS-2006-183). Though the report is ten years old, students may want to download and devour it as representative of the motivation and flavor of the course. The Berkeley prototypes are: dense direct solvers, sparse direct solvers, spectral methods, N-body methods, structured grids / iterative solvers, unstructured grids / iterative solvers, Monte Carlo (including “MapReduce”), combinatorial logic, graph traversal, graphical models, finite state machines, dynamic programming, backtrack/branch-and-bound. The first seven are essential floating point kernels and the last six essential integer kernels. The course examines several of these kernels in detail.

Lecture Coverage Includes:

  • Introduction to large-scale predictive simulations: the combined culture of CS&E and HPC
  • Introduction to parallel architecture and programming models
  • Introduction to MPI, PETSc, and other software frameworks for HPC
  • Parallel algorithms for the solution of large, sparse linear systems and nonlinear systems with large, sparse Jacobians
  • Parallel algorithms for partial differential equations
  • Parallel algorithms for N-body particle dynamics

Evaluation and Grading

Evaluation consists of four components: problem sets, project, final exam, and class participation at the flipped local site. Problem sets may be undertaken cooperatively (and this is encouraged), but each student must submit the homework separately under their own name, vouching for their own responsibility for the answers. The quality of the write-up is part of the grade. It is intended that all students should be able to score well on the problem sets, because they will be announced well in advance of their due dates and students have unlimited time for their own reading and research of the topics and for consultations with one another. The problem sets should create an extended ongoing discussion for the class community. The project is intended to be individual. If students want to team up to undertake a “bigger” project and earn the same grade for it, this should be negotiated when projects are launched in mid-course. Projects will be submitted in report form, and each project will be featured in a short presentation to the class at the end of the semester. The final exam is, of course, individual.

Frequently Asked Questions

Must I understand PDEs and Linear Algebra well to take this course?

Algorithms for partial differential equation and linear algebraic computations motivate this course, and knowledge of their mathematics adds substance to the parallel applications. However, the aspects of these subjects that are important to success in this course have to do with understanding the choreography of data and hardware. If you are comfortable with following the data in these algorithms without a theoretical understanding of how they approximate the real world (modeling) or how rapidly they converge to it (analysis), you can survive this course and even excel in it. Mathematical theorems, e.g., those tying the convergence of an iterative method to the condition number of a matrix, have a quality of subroutines: if the upstream hypotheses (inputs) are verified, the consequences (outputs) may be chained into downstream uses in this course, e.g., complexity analyses.
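The idea of a theorem acting as a subroutine can be sketched in a few lines of Python. The example takes the classical conjugate gradient error bound for a symmetric positive definite matrix with condition number kappa, which says the energy-norm error after k iterations is at most 2 * ((sqrt(kappa) - 1) / (sqrt(kappa) + 1))^k times the initial error, and turns it into an iteration count. The choice of conjugate gradients and the function name are assumptions made for illustration; the text above does not name a specific method.

```python
import math


def cg_iteration_bound(kappa, tol):
    """Smallest k with 2 * rho**k <= tol, rho = (sqrt(kappa)-1)/(sqrt(kappa)+1).

    Input (hypothesis): a symmetric positive definite matrix with condition
    number kappa. Output (consequence): an upper bound on the number of
    conjugate gradient iterations needed to reduce the energy-norm error
    by the factor tol.
    """
    if kappa < 1.0 or not (0.0 < tol < 1.0):
        raise ValueError("need kappa >= 1 and 0 < tol < 1")
    root = math.sqrt(kappa)
    rho = (root - 1.0) / (root + 1.0)
    if rho == 0.0:
        return 1  # kappa == 1: the bound is already zero after one step
    return math.ceil(math.log(2.0 / tol) / -math.log(rho))


for kappa in (1.0e2, 1.0e4, 1.0e6):
    print(f"kappa = {kappa:.0e}: at most {cg_iteration_bound(kappa, 1.0e-6)} iterations")
```

For the standard five-point discretization of the two-dimensional Poisson problem on a uniform mesh of spacing h, kappa grows like h^-2, so halving h roughly doubles this estimate. Multiplying the iteration count by the cost of one sparse matrix-vector product is then a typical downstream complexity analysis.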

Must I be facile in Unix and C/C++ to take this course?

In this course, you will work with sample applications written in C and you will build and execute on Linux-based distributed systems. One can pick up what one needs without being an expert in the tools applied.

Do you have a motto for success in difficult endeavors like high performance computing?

Actually, this is not a frequently asked question, but it should be. I do have a motto, taken from the most successful college football coach in history, Bear Bryant (1913–1983), as measured by the number of career wins amassed: “It’s not the will to win, but the will to prepare to win that makes the difference.”

Source: Ohio Supercomputer Center
