Simulating 44-Qubit quantum circuits using AWS ParallelCluster

By Amazon Web Services

September 20, 2022

Dr. Fabio Baruffa, Sr. HPC & QC Solutions Architect
Dr. Pavel Lougovski, Pr. QC Research Scientist
Tyson Jones, Doctoral researcher, University of Oxford

Introduction

Currently, an enormous effort is underway to develop quantum computing hardware capable of scaling to hundreds, thousands, and even millions of physical (non-error-corrected) qubits. Ultimately, this is to build fault-tolerant quantum computers. Classically simulating the behavior of systems with a large number of qubits is a key to understanding the behavior of physical quantum systems under varying noise conditions as they scale.

Simulations are also invaluable to understand the noise resilience of quantum algorithms. Because the noise characteristics of today’s hardware prototypes often defy analytic treatment, they are instead investigated through small-scale experiments and intensive numerical modelling. Even performance evaluations of perfect noise-free quantum algorithms typically require some form of classical emulation.

Unsurprisingly, such emulation tasks are computationally demanding and memory intensive, so the researchers must use high performance computing (HPC) strategies like data and algorithm distribution when modelling even modestly-sized present-day quantum experiments. HPC simulators of quantum computers are therefore an indispensable tool in the advancement of experimental and algorithmic research.

In this blog post, we describe how to perform large-scale quantum circuits simulations using AWS ParallelCluster with QuEST, the Quantum Exact Simulation Toolkit. We demonstrate a simple and rapid deployment of computational resources up to 4,096 compute instances to simulate random quantum circuits with up to 44 qubits.

Prerequisites

Quantum computing has the potential to accelerate current computation capabilities using the principles of quantum physics, and possibly solve specific complex problems that are difficult to address with conventional computers. This is a major area of research field, where new hardware and software needs to be developed. Currently, a crucial role is played by classical simulations of quantum computers for demonstrating and proofing new ideas and experimenting before a production environment is developed.

Classic simulations

Quantum computers can be classically simulated using a variety of algorithmic paradigms, each with their own costs and performance trade-offs. The choice of the simulation algorithm is often determined by the nature of the questions asked about the emulated quantum device, such as the probability of a particular error occurring, or the expected value of an observable. We will introduce two ubiquitous paradigms: state-vector (SV) and tensor-network (TN) simulation.

SV simulators, also known as “full-state”, “brute-force” and “Schrödinger-style” simulators, maintain a complete numerical description of the evolving quantum state of a quantum circuit. As such, they require memory that scales exponentially with the number of qubits in the circuit, but their runtime scales linearly with the quantum circuit depth. Since their complete quantum state output permits the precise and efficient a posteriori calculation of any property, they are the conventional first choice of simulator for much of quantum computing research.

In contrast, TN simulators have constant growing memory requirements as the number of qubits increases. TN simulators are exponentially slowed by deepening circuits and increasing state complexity. This makes them cheaper and faster in the study of shallow circuits with a suitable structure, and the simulation can potentially scale to many qubits.

The performance bottleneck of SV simulators is the propagation of a quantum state, while for TN simulators, it is the propagation of a particular observable. QuEST is a SV simulator and in this blog post, we will employ it for the study of circuits for which SV simulation is particularly well suited

In a State-vector (SV) simulation, an N-qubit register is represented by a state-vector of 2complex amplitudes and can be numerically instantiated as an array of 2×2N real floating-point numbers. SV simulation of N=40 qubits at double precision would therefore require 16,384 GiB, well beyond the capacity of a typical HPC compute node. This makes the use of distributed memory systems essential. To date, large-scale SV simulations were performed exclusively on purpose-built supercomputers and required a long lead time just to allocate the resources.

AWS resources

If you are interested in simulating small to moderately-sized quantum circuits, Amazon Braket offers the choice of several simulators. These include the local simulator that is included in the Braket SDK and three on-demand simulators. The local simulator can run on a laptop or within an Braket managed notebook and supports simulation of quantum circuits with and without noise

The on-demand simulators are SV1, a general-purpose state vector simulator; DM1, a density matrix simulator that supports noise modeling; and TN1, a tensor network simulator that specializes in certain larger scale structured quantum circuits. SV1 is suitable for circuits up to 34 qubits, and DM1 supports the simulation of circuits up to 17 qubits. While TN1 can simulate up to 50 qubits, it can be used only for suitably structured quantum circuits. This blog complements the Braket simulators by exploring the scalability of larger SV simulation circuits with up to 44 qubits using the QuEST simulator on Amazon Elastic Compute Cloud (Amazon EC2).

Amazon EC2 provides a wide selection of instance types optimized to fit different use cases. Amazon EC2 compute-optimized instances are ideal for compute bound workloads and intensive numerical modeling. For example, 256 c5.18xlarge (144 GiB of memory) instances would together contain sufficient memory to store the distributed state-vector for a 40-qubit circuit, including the doubled memory costs of storing the necessary auxiliary buffers for MPI communication. Of course, simulating just an additional qubit will double the total memory requirement. Simulation of an N=44 qubit register requires 562,950 GiB (~0.5 PiB) of memory or 4,096 c5.18xlarge instances.

To orchestrate your compute resources, AWS developed an open-source cluster management tool, AWS ParallelCluster, which simplifies deploying and managing HPC clusters on AWS. AWS ParallelCluster enables the rapid deployment of virtual clusters with varying architectures to meet the requirements of different applications and workflows. You can also run your computation immediately when needed without waiting in a queue for a shared compute resource. As a result, many scientists and companies worldwide are looking to use cloud computing to find solutions to their problems in an efficient and cost-effective manner.

The remainder of this blog post demonstrates an HPC deployment of QuEST with AWS ParallelCluster to simulate random circuits. Random circuits appear both in the verification of real quantum computers and in the performance benchmarking of quantum computing simulations.

Circuit Details

We use QuEST to simulate a generic quantum circuit in a distributed memory system. We sample the probability distribution over N-bit strings produced by N-qubit circuits using one- and two-qubit gates and multi-qubit controlled gates. We implemented a set of random N-qubit quantum circuit using the following algorithm:

  • Set the total number of qubits N and gates Gn in a circuit
  • Looping for each gate in Gn:
    • toss an unbiased coin
    • if the outcome of the coin toss is heads:
      • choose two indices (q1, q2) randomly, each from 1 to N
      • apply two-qubit CZ gate between qubits q1 and q2
    • if the outcome of the coin toss is tails:
      • choose an index q1 randomly from 1 to N
      • choose a single qubit gate G from {RX, RY, RZ, H} uniformly at random
      • if G is H:
        • apply H to qubit q1
      • if G is RX, RY, or RZ
        • choose a random number θ between 0 and pi (3.1415…)
        • apply the corresponding rotation by the angle θ to the qubit q1

The single qubit gates RX, RY, RZ are the rotation gate along the respectively axis and the H is the Hadamard gate. The two-qubit gate CZ is the controlled phase flip.

The randomness of the circuits prevents particular symmetries being explored to optimize the classical simulation.

We run circuit simulations in QuEST by iterating over the number of qubits, starting from N=40 to N=44 and using the following number of gates Gn​=(100, 200, 400, 600, 800, 1000) for each value of N. We always initialize the quantum state of the circuit to ∣0⟩N and compute 2N complex amplitudes of the final state after the random circuit is applied to the initial state. Because SV simulations are implemented as a sequence of Gn​ matrix-vector multiplications, we estimate the total number of floating-point operations (FLOP) complexity of simulating a single complex amplitude in the final state vector by recording the number of elementary multiplication and addition operations and dividing them by the total number of amplitudes (2N).

Circuit Complexity

The computational complexity of simulating a random N-qubit circuit using an SV simulator, such as QuEST, grows exponentially with N but scales linearly with the number of single- and two-bit gates Gn​. In other words, the computational cost does not discriminate between different circuit structures.

Other simulation approaches, such as tensor network (TN) simulations, are much more sensitive to random circuit structure. TN simulators do not compute an entire N-qubit state vector but rather can find an optimal contraction path for estimating a single amplitude in the state vector. Many amplitudes in a state vector generated by a random circuit can be 0 and do not need to be evaluated explicitly and TN simulators can help identify amplitudes for which this holds.

However, random circuits with circuit depth greater than 400 gates incur a large computational cost per amplitude that grows polynomially with the circuit depth. These circuits are better suited for SV simulations where simulation cost grows linearly with the depth.

Resources deployment

We demonstrate large-scale simulations of quantum circuits using QuEST, an open-source quantum state vector simulator. QuEST can run multithread and distributed calculations using MPI/OpenMP to accelerate simulations on HPC systems. The HPC infrastructure is deployed using AWS ParallelCluster. The following diagram shows the HPC architecture.

Figure 1: HPC Architecture

The Head Node is used to log in to the cluster, compile the application, submit the job, and set up Compute Nodes, which are dynamically provisioned according to the size of the problem (number of qubits).

We use the EC2 c5.18xlarge compute-optimized instances with Intel Xeon Scalable Processors with a sustained all core Turbo frequency of 3.4GHz. The instances are equipped with 36 cores and 144 GiB of memory per node, which gives the best compromise between resources required for the circuit and performance. The memory-per-core ratio is 4 GiB, which allows for an efficient usage of 2 MPI tasks per instance. The following table shows the required resources for simulations with 36 to 44 qubits.

Number of qubitsMemory Required (GiB)Number of instancesTotal available memory EC2 (GiB)Total number of cores

Number of qubitsMemory Required (GiB)Number of instancesTotal available memory EC2 (GiB)Total number of cores

36 2,199 16 2,304 576
37 4,398 32 4,608 1152
38 8,796 64 9,216 2,304
39 17,592 128 18,432 4,608
40 35,184 256 36,864 9,216
41 70,369 512 73,728 18,432
42 140,737 1,024 147,456 36,864
43 281,475 2,048 294,912 73,728
44 562,950 4,096 589,824 147,456

Table 1: Resources required by the state vector simulator to simulate a circuit with the given number of qubits.

We compiled QuEST version 3.5.0 from source code with the Intel OneAPI HPC toolkit, version 2022.2, to take advantage of the performance optimization provided by the AVX512 and AVX2 vector instructions available on C5 instances. We used Amazon Linux 2 for the operating system, and the Intel OneAPI MPI 2022.2 for the network library.

Performance results

We explore the scalability of the simulation with respect to the number of instances. Adding one additional qubit doubles the memory requirements and the number of instances required by the state vector simulator. In all experiments, we use 2 MPI tasks per instance with 18 OMP threads, and we disable hyperthreading…

Read the full blog to learn more. Reminder: You can learn a lot from AWS HPC engineers by subscribing to the HPC Tech Short YouTube channel, and following the AWS HPC Blog channel.

 

Return to Solution Channel Homepage
Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industy updates delivered to you every week!

Nvidia Shuts Out RISC-V Software Support for GPUs 

September 23, 2022

Nvidia is not interested in bringing software support to its GPUs for the RISC-V architecture despite being an early adopter of the open-source technology in its GPU controllers. Nvidia has no plans to add RISC-V support for CUDA, which is the proprietary GPU software platform, a company representative... Read more…

Microsoft Closes Confidential Computing Loop with AMD’s Milan Chip

September 22, 2022

Microsoft shared details on how it uses an AMD technology to secure artificial intelligence as it builds out a secure AI infrastructure in its Azure cloud service. Microsoft has a strong relationship with Nvidia, but is also working with AMD's Epyc chips (including the new 3D VCache series), MI Instinct accelerators, and also... Read more…

Nvidia Introduces New Ada Lovelace GPU Architecture, OVX Systems, Omniverse Cloud

September 20, 2022

In his GTC keynote today, Nvidia CEO Jensen Huang launched another new Nvidia GPU architecture: Ada Lovelace, named for the legendary mathematician regarded as the first computer programmer. The company also announced tw Read more…

Nvidia’s Hopper GPUs Enter ‘Full Production,’ DGXs Delayed Until Q1

September 20, 2022

Just about six months ago, Nvidia’s spring GTC event saw the announcement of its hotly anticipated Hopper GPU architecture. Now, the GPU giant is announcing that Hopper-generation GPUs (which promise greater energy eff Read more…

NeMo LLM Service: Nvidia’s First Cloud Service Makes AI Less Vague

September 20, 2022

Nvidia is trying to uncomplicate AI with a cloud service that makes AI and its many forms of computing less vague and more conversational. The NeMo LLM service, which Nvidia called its first cloud service, adds a layer of intelligence and interactivity... Read more…

AWS Solution Channel

Shutterstock 1194728515

Simulating 44-Qubit quantum circuits using AWS ParallelCluster

Dr. Fabio Baruffa, Sr. HPC & QC Solutions Architect
Dr. Pavel Lougovski, Pr. QC Research Scientist
Tyson Jones, Doctoral researcher, University of Oxford

Introduction

Currently, an enormous effort is underway to develop quantum computing hardware capable of scaling to hundreds, thousands, and even millions of physical (non-error-corrected) qubits. Read more…

Microsoft/NVIDIA Solution Channel

Shutterstock 1166887495

Improving Insurance Fraud Detection using AI Running on Cloud-based GPU-Accelerated Systems

Insurance is a highly regulated industry that is evolving as the industry faces changing customer expectations, massive amounts of data, and increased regulations. A major issue facing the industry is tracking insurance fraud. Read more…

Nvidia Targets Computers for Robots in the Surgery Rooms

September 20, 2022

Nvidia is laying the groundwork for a future in which humans and robots will be collaborators in the surgery rooms at hospitals. The company announced a computer called IGX for Medical Devices, which will be populated in robots, image scanners and other computers and medical devices involved in patient care close to the point... Read more…

Nvidia Shuts Out RISC-V Software Support for GPUs 

September 23, 2022

Nvidia is not interested in bringing software support to its GPUs for the RISC-V architecture despite being an early adopter of the open-source technology in its GPU controllers. Nvidia has no plans to add RISC-V support for CUDA, which is the proprietary GPU software platform, a company representative... Read more…

Nvidia Introduces New Ada Lovelace GPU Architecture, OVX Systems, Omniverse Cloud

September 20, 2022

In his GTC keynote today, Nvidia CEO Jensen Huang launched another new Nvidia GPU architecture: Ada Lovelace, named for the legendary mathematician regarded as Read more…

Nvidia’s Hopper GPUs Enter ‘Full Production,’ DGXs Delayed Until Q1

September 20, 2022

Just about six months ago, Nvidia’s spring GTC event saw the announcement of its hotly anticipated Hopper GPU architecture. Now, the GPU giant is announcing t Read more…

NeMo LLM Service: Nvidia’s First Cloud Service Makes AI Less Vague

September 20, 2022

Nvidia is trying to uncomplicate AI with a cloud service that makes AI and its many forms of computing less vague and more conversational. The NeMo LLM service, which Nvidia called its first cloud service, adds a layer of intelligence and interactivity... Read more…

Nvidia Targets Computers for Robots in the Surgery Rooms

September 20, 2022

Nvidia is laying the groundwork for a future in which humans and robots will be collaborators in the surgery rooms at hospitals. The company announced a computer called IGX for Medical Devices, which will be populated in robots, image scanners and other computers and medical devices involved in patient care close to the point... Read more…

Survey Results: PsiQuantum, ORNL, and D-Wave Tackle Benchmarking, Networking, and More

September 19, 2022

The are many issues in quantum computing today – among the more pressing are benchmarking, networking and development of hybrid classical-quantum approaches. Read more…

HPC + AI Wall Street to Feature ‘Spooky’ Science for Financial Services

September 18, 2022

Albert Einstein famously described quantum mechanics as "spooky action at a distance" due to the non-intuitive nature of superposition and quantum entangled par Read more…

Analog Chips Find a New Lease of Life in Artificial Intelligence

September 17, 2022

The need for speed is a hot topic among participants at this week’s AI Hardware Summit – larger AI language models, faster chips and more bandwidth for AI machines to make accurate predictions. But some hardware startups are taking a throwback approach for AI computing to counter the more-is-better... Read more…

Nvidia Shuts Out RISC-V Software Support for GPUs 

September 23, 2022

Nvidia is not interested in bringing software support to its GPUs for the RISC-V architecture despite being an early adopter of the open-source technology in its GPU controllers. Nvidia has no plans to add RISC-V support for CUDA, which is the proprietary GPU software platform, a company representative... Read more…

AWS Takes the Short and Long View of Quantum Computing

August 30, 2022

It is perhaps not surprising that the big cloud providers – a poor term really – have jumped into quantum computing. Amazon, Microsoft Azure, Google, and th Read more…

The Final Frontier: US Has Its First Exascale Supercomputer

May 30, 2022

In April 2018, the U.S. Department of Energy announced plans to procure a trio of exascale supercomputers at a total cost of up to $1.8 billion dollars. Over the ensuing four years, many announcements were made, many deadlines were missed, and a pandemic threw the world into disarray. Now, at long last, HPE and Oak Ridge National Laboratory (ORNL) have announced that the first of those... Read more…

US Senate Passes CHIPS Act Temperature Check, but Challenges Linger

July 19, 2022

The U.S. Senate on Tuesday passed a major hurdle that will open up close to $52 billion in grants for the semiconductor industry to boost manufacturing, supply chain and research and development. U.S. senators voted 64-34 in favor of advancing the CHIPS Act, which sets the stage for the final consideration... Read more…

Top500: Exascale Is Officially Here with Debut of Frontier

May 30, 2022

The 59th installment of the Top500 list, issued today from ISC 2022 in Hamburg, Germany, officially marks a new era in supercomputing with the debut of the first-ever exascale system on the list. Frontier, deployed at the Department of Energy’s Oak Ridge National Laboratory, achieved 1.102 exaflops in its fastest High Performance Linpack run, which was completed... Read more…

Chinese Startup Biren Details BR100 GPU

August 22, 2022

Amid the high-performance GPU turf tussle between AMD and Nvidia (and soon, Intel), a new, China-based player is emerging: Biren Technology, founded in 2019 and headquartered in Shanghai. At Hot Chips 34, Biren co-founder and president Lingjie Xu and Biren CTO Mike Hong took the (virtual) stage to detail the company’s inaugural product: the Biren BR100 general-purpose GPU (GPGPU). “It is my honor to present... Read more…

Newly-Observed Higgs Mode Holds Promise in Quantum Computing

June 8, 2022

The first-ever appearance of a previously undetectable quantum excitation known as the axial Higgs mode – exciting in its own right – also holds promise for developing and manipulating higher temperature quantum materials... Read more…

AMD’s MI300 APUs to Power Exascale El Capitan Supercomputer

June 21, 2022

Additional details of the architecture of the exascale El Capitan supercomputer were disclosed today by Lawrence Livermore National Laboratory’s (LLNL) Terri Read more…

Leading Solution Providers

Contributors

Tesla Bulks Up Its GPU-Powered AI Super – Is Dojo Next?

August 16, 2022

Tesla has revealed that its biggest in-house AI supercomputer – which we wrote about last year – now has a total of 7,360 A100 GPUs, a nearly 28 percent uplift from its previous total of 5,760 GPUs. That’s enough GPU oomph for a top seven spot on the Top500, although the tech company best known for its electric vehicles has not publicly benchmarked the system. If it had, it would... Read more…

Exclusive Inside Look at First US Exascale Supercomputer

July 1, 2022

HPCwire takes you inside the Frontier datacenter at DOE's Oak Ridge National Laboratory (ORNL) in Oak Ridge, Tenn., for an interview with Frontier Project Direc Read more…

AMD Opens Up Chip Design to the Outside for Custom Future

June 15, 2022

AMD is getting personal with chips as it sets sail to make products more to the liking of its customers. The chipmaker detailed a modular chip future in which customers can mix and match non-AMD processors in a custom chip package. "We are focused on making it easier to implement chips with more flexibility," said Mark Papermaster, chief technology officer at AMD during the analyst day meeting late last week. Read more…

Intel Reiterates Plans to Merge CPU, GPU High-performance Chip Roadmaps

May 31, 2022

Intel reiterated it is well on its way to merging its roadmap of high-performance CPUs and GPUs as it shifts over to newer manufacturing processes and packaging technologies in the coming years. The company is merging the CPU and GPU lineups into a chip (codenamed Falcon Shores) which Intel has dubbed an XPU. Falcon Shores... Read more…

Nvidia, Intel to Power Atos-Built MareNostrum 5 Supercomputer

June 16, 2022

The long-troubled, hotly anticipated MareNostrum 5 supercomputer finally has a vendor: Atos, which will be supplying a system that includes both Nvidia and Inte Read more…

UCIe Consortium Incorporates, Nvidia and Alibaba Round Out Board

August 2, 2022

The Universal Chiplet Interconnect Express (UCIe) consortium is moving ahead with its effort to standardize a universal interconnect at the package level. The c Read more…

Using Exascale Supercomputers to Make Clean Fusion Energy Possible

September 2, 2022

Fusion, the nuclear reaction that powers the Sun and the stars, has incredible potential as a source of safe, carbon-free and essentially limitless energy. But Read more…

Is Time Running Out for Compromise on America COMPETES/USICA Act?

June 22, 2022

You may recall that efforts proposed in 2020 to remake the National Science Foundation (Endless Frontier Act) have since expanded and morphed into two gigantic bills, the America COMPETES Act in the U.S. House of Representatives and the U.S. Innovation and Competition Act in the U.S. Senate. So far, efforts to reconcile the two pieces of legislation have snagged and recent reports... Read more…

  • arrow
  • Click Here for More Headlines
  • arrow
HPCwire