Blue Waters Project Announces Introduction to HPC Graduate Course

March 23, 2016

March 23 -- The Blue Waters project at the University of Illinois is pleased to announce a graduate course, Introduction to High Performance Computing, that will be offered as a collaborative online course for multiple participating institutions. We are seeking other university partners that are interested in offering the course for credit to their students. The course includes online video lectures, quizzes, and homework assignments, with access to free accounts on the Blue Waters system.

Participating institutions will need to provide a local instructor who will be responsible for advising the local students and officially assigning grades. Students will complete the online course exams and exercises as part of their grade.

The instructor for the course is Dr. David E. Keyes, Director of the Extreme Computing Research Center and Founding Dean of the Mathematical and Computer Sciences and Engineering Division at the King Abdullah University of Science and Technology (KAUST).

Prerequisites for the graduate students include:

  • Experience working in a Unix environment
  • Experience developing and running scientific codes written in C or C++
  • Familiarity with basic numerical algorithms and basic computer architecture

The expectations for students, faculty, and the instruction team are noted below. Interested faculty should contact Steve Gordon, organizer of the Blue Waters course program, at sgordon@osc.edu or by phone at 614-292-4132.

Expectations for Participants

The expectations of the “collaborating faculty” are that they will:

  • Establish a “collaborating course” (possibly a special topics course) on the autumn course catalog
  • Promote this course to students on their own campus
  • View the recorded lectures together with their local enrolled students
  • Provide office hours to advise the students on the course content
  • Proctor the course exam
  • Provide regular feedback on behalf of the students to Dr. Keyes on the course throughout the semester

The expectations of Dr. Keyes and the O2PEP team are that they will:

  • Provide an initial live web-cast to introduce the instructor, TAs, support staff, and introduce remote participants and faculty to one another
  • Provide two recorded lectures per week
  • Provide exercises and activities for the students
  • Provide a web space for all course related materials
  • Provide regular quizzes to allow the students to assess their own progress
  • Provide a mid-term exam and a final exam
  • Grade all the quizzes and exams
  • Provide TAs to assist all students with questions about the course content, exercises, quizzes, and other materials covered during the semester
  • Conduct an evaluation of the course with the participants and collaborating faculty

Expectations of the Students

  • Students must register in a “collaborating course” on their own campus
  • Students will need their own laptop or desktop system
  • Students are expected to view the recorded lectures with their local “collaborating faculty” and to learn and discuss the content as a group
  • Students are expected to contact the TAs at the University of Illinois for in-depth questions about the content, exercises, or other materials
  • Students will be asked to submit quizzes for self-assessment purposes
  • Students will be asked to submit a mid-term and a final exam for determining a grade, with a scale applied according to their own campus grading methods

Course Description

High performance computing algorithms and software technology, with an emphasis on using distributed memory systems for scientific computing. Theoretical and practically achievable performance for processors, memory system, and network, for large-scale scientific applications. The state-of-the-art and promise of predictive computational science and engineering. Algorithmic kernels common to linear and nonlinear algebraic systems, partial differential equations, integral equations, particle methods, optimization, and statistics. Computer architecture and the stresses put on scientific applications and their underlying mathematical algorithms by emerging architecture. State-of-the-art discretization techniques, solver libraries, and execution frameworks.

Prerequisites

Experience using C/C++ in a Unix environment, familiarity with basic numerical algorithms, and familiarity with computer architecture.

Course Flavor

A good subtitle for this course would be “Algorithms as if architecture mattered.” Architecture increasingly does matter today. During decades of progress using the paradigm of bulk synchronous processing on systems that were small enough to be considered “flat” and tightly coupled, architecture could largely be abstracted away through the message passing interface (MPI), an excellent example of “separation of concerns” in computer science. One could write in a high-level language without concern about where the compiler and runtime stashed the operands, because flops were relatively slow, which made everything else, including the physical layout of the architecture, appear nearly flat. One could count flops for serial complexity estimation, and determine how many could be done concurrently (between synchronization events) for parallel complexity estimation.

Today, however, flops are cheap compared to the cost of moving data, in both time and energy expenditure. Therefore, we must worry about the topology of the network and the latencies and bandwidths of every part of the memory system and network in getting the operands to the FPUs. This gives high performance computing an emphasis different from some other types of computing. The same architecture advances that make it frustrating also make it exciting!

What new high performance science and engineering computing users need is an introduction to the concepts, the hardware and software environments, and selected algorithms and applications of parallel scientific computing, with an emphasis on tightly coupled computations that are capable of scaling to thousands of processors and well beyond. The course material ranges (selectively) from high-level descriptions of motivating applications to low-level details of implementation, in order to expose the algorithmic kernels and the shifting balances of computation and communication between them.

The homework assignments range from simple theoretical studies to running and modifying demonstration codes. Modest programming assignments using MPI and PETSc culminate in an independent project leading to an in-class report.
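The claim that flops are cheap relative to data movement can be made concrete with the standard latency-bandwidth model, in which sending a message of n bytes costs roughly latency + n / bandwidth. The sketch below is only an illustration: the machine parameters and the function names are hypothetical placeholders, not measurements of Blue Waters or of any other system.

```python
# Illustrative, hypothetical machine parameters (not measurements of any real system).
FLOP_RATE = 1.0e10        # floating-point operations per second on one core
LATENCY = 1.0e-6          # seconds to start a message
INV_BANDWIDTH = 1.0e-9    # seconds per byte transferred (1 GB/s)


def compute_time(flops, flop_rate=FLOP_RATE):
    """Time to execute `flops` floating-point operations."""
    return flops / flop_rate


def message_time(nbytes, latency=LATENCY, inv_bandwidth=INV_BANDWIDTH):
    """Latency-bandwidth estimate of the time to send one message."""
    return latency + nbytes * inv_bandwidth


def flops_per_message(nbytes):
    """Number of flops that could be executed while one message is in flight."""
    return message_time(nbytes) * FLOP_RATE


if __name__ == "__main__":
    for nbytes in (8, 8_000, 8_000_000):
        print(nbytes, "bytes costs about", round(flops_per_message(nbytes)), "flops")
```

Under these assumed numbers, even a message carrying a single double costs as much time as thousands of flops, which is why an algorithm's communication pattern can matter more than its operation count.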

Instructors

The principal lecturer will be David Keyes, Professor of Applied Mathematics and Computational Science, KAUST. Guest lecturers will be invited to speak on their specialties. Lectures from Extreme Computing Research Center staff members highlighting open source scientific software will be incorporated into the course.

Goals and Syllabus

The overall goal is to acquaint students who anticipate doing independent work that may benefit from large-scale simulation with current hardware, software tools, practices, and trends in parallel scientific computing, and to provide an opportunity to build and execute sample parallel codes. The software employed in course examples is freely available. The course is also designed to make students intelligent consumers and critics of parallel scientific computing literature and conferences.

Much of the motivation for parallel scientific computing comes from simulations based on discretizations of partial differential equations (PDEs, typically described with sparse matrices), or integral equations (IEs, typically described with dense matrices), or based on interacting particles (unstructured interaction lists, often embedded in octrees). Of course, many applications are nonlinear, but these are typically approached as a series of linearized analyses. An understanding of the underlying equations, their physical meaning, and their mathematical analysis is important in some parts of the course and opens up many possibilities for independent projects. Other material is easily abstracted away from its underlying operator equation context to that of a generic bulk-synchronous computation that interleaves flows of data with operations on that data. The intention is to provide a course of benefit to a broad clientele of graduate researchers. In addition to computer scientists and applied mathematicians, students from mechanical engineering, electrical engineering, chemical engineering, materials science, and geophysics should find it of interest and approachable if they already have sufficient background in computing to be motivated towards the high end.
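As a small illustration of why PDE discretizations lead to sparse matrices, the sketch below builds the familiar tridiagonal matrix of the one-dimensional Poisson problem in compressed sparse row (CSR) form and multiplies it by a vector. The example problem and the function names are chosen here for illustration and are not taken from the course materials.

```python
def poisson_1d_csr(n):
    """Return (values, col_idx, row_ptr) for the n x n matrix tridiag(-1, 2, -1)."""
    values, col_idx, row_ptr = [], [], [0]
    for i in range(n):
        if i > 0:                      # sub-diagonal entry
            values.append(-1.0)
            col_idx.append(i - 1)
        values.append(2.0)             # diagonal entry
        col_idx.append(i)
        if i < n - 1:                  # super-diagonal entry
            values.append(-1.0)
            col_idx.append(i + 1)
        row_ptr.append(len(values))    # marks where the next row starts
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x while touching only the stored nonzeros."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y
```

Only 3n - 2 of the n * n entries are stored, so both the memory footprint and the work per matrix-vector product grow linearly with the number of grid points.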

Thirteen algorithmic prototypes that occur regularly in scientific computing have been identified in a famous 2006 Berkeley technical report, “The Landscape of Parallel Computing Research: The View from Berkeley” (UCB/EECS-2006-183). Though the report is ten years old, students may want to download and devour it as representative of the motivation and flavor of the course. The Berkeley prototypes are: dense direct solvers, sparse direct solvers, spectral methods, N-body methods, structured grids / iterative solvers, unstructured grids / iterative solvers, Monte Carlo (including “MapReduce”), combinatorial logic, graph traversal, graphical models, finite state machines, dynamic programming, backtrack/branch-and-bound. The first seven are essential floating point kernels and the last six essential integer kernels. The course examines several of these kernels in detail.

Lecture Coverage Includes:

  • Introduction to large-scale predictive simulations: the combined culture of CS&E and HPC
  • Introduction to parallel architecture and programming models
  • Introduction to MPI, PETSc, and other software frameworks for HPC
  • Parallel algorithms for the solution of large, sparse linear systems and nonlinear systems with large, sparse Jacobians
  • Parallel algorithms for partial differential equations
  • Parallel algorithms for N-body particle dynamics
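The bulk-synchronous pattern that underlies several of these topics (move data, then compute, then repeat) can be sketched without MPI. In the hedged example below, plain Python lists stand in for the memories of separate MPI ranks, and a copy between lists stands in for a message. The Jacobi sweep and all function names are illustrative assumptions, not course code.

```python
def jacobi_serial(u, steps):
    """Reference: `steps` Jacobi sweeps on a list whose two end values stay fixed."""
    u = list(u)
    for _ in range(steps):
        new = list(u)
        for i in range(1, len(u) - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])
        u = new
    return u


def split(u, nranks):
    """Give each simulated rank a contiguous block of interior points plus two ghost cells."""
    interior = len(u) - 2
    if not 1 <= nranks <= interior:
        raise ValueError("need between 1 and len(u) - 2 ranks")
    base, extra = divmod(interior, nranks)
    blocks, start = [], 1
    for r in range(nranks):
        size = base + (1 if r < extra else 0)
        blocks.append(u[start - 1:start + size + 1])  # left ghost, owned points, right ghost
        start += size
    return blocks


def exchange_halos(blocks):
    """Communication phase: copy each neighbor's edge value into the adjacent ghost cell."""
    for r in range(len(blocks) - 1):
        blocks[r][-1] = blocks[r + 1][1]   # right ghost gets the neighbor's first owned point
        blocks[r + 1][0] = blocks[r][-2]   # left ghost gets the neighbor's last owned point


def local_sweep(block):
    """Computation phase: update owned points only; ghost cells are read, not written."""
    new = list(block)
    for i in range(1, len(block) - 1):
        new[i] = 0.5 * (block[i - 1] + block[i + 1])
    return new


def jacobi_bsp(u, steps, nranks):
    """Run the same sweeps as jacobi_serial, organized as exchange-then-compute supersteps."""
    blocks = split(u, nranks)
    for _ in range(steps):
        exchange_halos(blocks)                      # move data
        blocks = [local_sweep(b) for b in blocks]   # operate on data
    result = [u[0]]
    for b in blocks:
        result.extend(b[1:-1])                      # gather owned points in rank order
    result.append(u[-1])
    return result
```

Each simulated rank needs only one value from each neighbor per sweep, so communication scales with the surface of a block while computation scales with its volume. In a real MPI code the two copies in `exchange_halos` would become send and receive calls between neighboring ranks.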

Evaluation and Grading

Evaluation consists of four components: problem sets, project, final exam, and class participation at the flipped local site. Problem sets may be undertaken cooperatively (and this is encouraged), but each student must submit the homework separately under their own name, vouching for their own responsibility for the answers. The quality of the write-up is part of the grade. It is intended that all students should be able to score well on the problem sets, because they will be announced well in advance of their due dates and students have unlimited time for their own reading and research of the topics and for consultations with one another. The problem sets should create an extended ongoing discussion for the class community. The project is intended to be individual. If students want to team up to undertake a “bigger” project and earn the same grade for it, this should be negotiated when projects are launched in mid-course. Projects will be submitted in report form, and each project will be featured for a short presentation to the class at the end of the semester. The final exam is, of course, individual.

Frequently Asked Questions

Must I understand PDEs and Linear Algebra well to take this course?

Algorithms for partial differential equations and linear algebraic computations motivate this course, and knowledge of their mathematics adds substance to the parallel applications. However, the aspects of these subjects that are important to success in this course have to do with understanding the choreography of data and hardware. If you are comfortable with following the data in these algorithms without a theoretical understanding of how they approximate the real world (modeling) or how rapidly they converge to it (analysis), you can survive this course and even excel in it. Mathematical theorems, e.g., those tying the convergence of an iterative method to the condition number of a matrix, have a quality of subroutines: if the upstream hypotheses (inputs) are verified, the consequences (outputs) may be chained into downstream uses in this course, e.g., complexity analyses.
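One way to picture a theorem as a subroutine is the classical bound for the conjugate gradient method on a symmetric positive definite system: after k iterations the error in the energy norm is at most 2 * rho**k times the initial error, where rho = (sqrt(kappa) - 1) / (sqrt(kappa) + 1) and kappa is the condition number. The sketch below uses this bound purely as an example. The function names and the `flops_per_row` work estimate are hypothetical, and the bound is a worst-case estimate, not a prediction of observed iteration counts.

```python
import math


def cg_iteration_bound(kappa, tol):
    """Smallest k >= 1 with 2 * rho**k <= tol, rho = (sqrt(kappa) - 1) / (sqrt(kappa) + 1)."""
    if kappa < 1.0 or not (0.0 < tol < 1.0):
        raise ValueError("need kappa >= 1 and 0 < tol < 1")
    rho = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    if rho == 0.0:
        return 1  # kappa == 1: the bound is already satisfied after one iteration
    return max(1, math.ceil(math.log(tol / 2.0) / math.log(rho)))


def solve_cost_estimate(n, kappa, tol, flops_per_row=10):
    """Chain the bound into a flop estimate: iterations times assumed work per iteration."""
    return cg_iteration_bound(kappa, tol) * flops_per_row * n
```

The hypothesis (a symmetric positive definite matrix with known condition number) is the input, the iteration count is the output, and `solve_cost_estimate` shows that output being passed downstream into a complexity estimate.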

Must I be facile in Unix and C/C++ to take this course?

In this course, you will work with sample applications written in C and you will build and execute on Linux-based distributed systems. One can pick up what one needs without being an expert in the tools applied.

Do you have a motto for success in difficult endeavors like high performance computing?

Actually, this is not a frequently asked question, but it should be. I do have a motto, taken from the most successful college football coach in history, Bear Bryant (1913–1983), as measured by the number of career wins amassed: “It’s not the will to win, but the will to prepare to win that makes the difference.”

Source: Ohio Supercomputer Center
