Try Before You Buy? Test Driving a Supercomputer System

By Alex Nodeland, CEO, Archanan

October 7, 2019

In a recent HPCwire article, it was revealed that DARPA is working to optimize programming approaches with the goal of increasing the performance of parallel systems. This is a worthwhile goal, and one that is squarely in line with our vision at Archanan, where we have developed a cloud platform to help speed research and development cycles by providing tools and environments that enable programmers to develop and test applications in real time, at scale. Our goal is to help maximize the organizational utility of any existing supercomputer (or other complex computing system), while speeding up the tendering and procurement process for system vendors by allowing engineers to develop and test applications on a virtualized model of the future system.

Interestingly enough, the DARPA article notes that “one possible approach to more efficient development of executable HPC code would be accurate modeling and prediction of component performance within a full-blown HPC platform.” As fate would have it, this is exactly what we have developed at Archanan and are currently rolling out at supercomputing centers across Asia.

We have developed a cloud-based platform in which an organization is able to administer a digital twin of its supercomputing system, emulating every component, from the storage and memory to the compute and fabric, thus enabling development and testing on an at-scale system without tying up the production system itself. Using the Archanan Development Cloud, organizations are able to administer personal Integrated Development Environments (IDEs) in the Archanan Cloud that mimic their own system. This helps to create new, efficient workflows that eliminate the testing bottlenecks and port-over failures associated with not being able to pre-test code at scale.

Through our rich background in supercomputing with several institutions, we have worked with many people in different roles across the high-performance computing community. We consistently hear about issues HPC developers are having with their workflows, and we are keenly aware that it is very difficult for an organization to change its development track after it has been deployed. The frustration always comes down to the same challenges: over-subscribed test systems that aren’t up to scale with the production machine.

Our mission is to change this paradigm by adding value at the beginning of the lifecycle for a supercomputer by working with hardware manufacturers to provide emulation of their upcoming architectures. They, in turn, can share this virtualized hardware in the Archanan Development Cloud with their customers, thus providing a “test drive” of the system to help provide better estimates for the performance of the system and its elements during the tendering process. Imagine a research center being able to run their top five applications on a system during the tendering process, while making adjustments to the system to right-size its performance to match its application needs. This “at scale” test drive ability has previously been unavailable, but today, there is no reason for any organization to commit financial resources to these expensive systems without first giving them a thorough examination using cloud emulation.

This resource comes at an ideal time in the advancement of supercomputing systems as we see increasing numbers of hybrid machines and specialized, advanced applications like AI, where specific accelerators are being considered. In these cases, it’s very difficult to predict performance when you are working across many different types of hardware. We’ve seen many supercomputing centers either over-provisioning or under-provisioning particular hardware components of the larger system. This, of course, is largely dependent on the applications that are being run, and at what capacities, making it critical to be able to test-drive before committing to a system.

We’re also seeing an increasing number of machines with many processor architectures: multiple CPU architectures (Power, x86, ARM, etc.) paired with multiple types of accelerators (GPU, FPGA, etc.). Previously, it was very difficult to reliably gauge the performance of such a system, but today, we can provide a snapshot of the whole machine, providing accurate benchmarking while sampling it against the applications intended to be run on it.

The best part is that this ability is only a single facet of the overall power of the Archanan Development Cloud. Once a system is requisitioned with specs fully determined, it may take upwards of two years before the purchasing organization takes custody of that system. Under the current paradigm, committing resources for development on that system is precarious because there is no way to accurately test the performance and portability of the applications being developed. However, with virtualized access to the machine, at-scale development can happen immediately. When an organization’s users have access to an emulated version of their future machine, production applications can be installed and run as soon as the power is switched on. Simply put, the supercomputer can reach effectiveness more quickly if people can develop and optimize their applications at scale before the machine is delivered.

Additional possibilities exist as well. For organizations such as universities, where current access to production machines is very limited, independent virtualized clones of their system can be made available on an individual, account-level basis. A university can feel less restricted in giving its students access to learn, explore, and experiment. Graduate students, undergrads, and anyone learning large-scale or parallel computing can have access to systems that look like the full machine. They can demonstrate production-scale workloads and prepare their projects for a better chance at deployment on the physical machine. Virtualizing the production machine lowers the bar for access to it, while increasing the system’s value and effectiveness.

Users of Archanan will change their supercomputing processes for the better by lowering risk, eliminating bottlenecks and maximizing the utility of these valuable systems. We encourage any organization purchasing or building a supercomputing system to get in touch to discuss how we can help. For more information, please visit us at archanan.io, or download our solution brief.

Alex Nodeland is the CEO and Co-founder of Archanan.

 
