Container App ‘Singularity’ Eases Scientific Computing

By Tiffany Trader

October 20, 2016

HPC container platform Singularity is just six months out from its 1.0 release but already is making inroads across the HPC research landscape. It’s in use at Lawrence Berkeley National Laboratory (LBNL), where Singularity founder Gregory Kurtzer has worked in the High Performance Computing Services (HPCS) group for 16 years, and it’s going into other leading HPC centers, including the Texas Advanced Computing Center (TACC), the San Diego Supercomputing Center (SDSC) and many more sites, large and small.

Singularity is just the latest of several large open source projects that Kurtzer, who is the technical lead and HPC architect at LBNL and also at UC Berkeley through a joint appointment, has founded and built. He also created the well-known open source projects Centos Linux and Warewulf (the latter is now a key part of the emerging OpenHPC stack). With these projects ongoing and Singularity attracting a lot of buzz, Kurtzer has never been busier. The day I talked with him he had just delivered two presentations extolling the merits of Singularity and was about to give another the following day.

Kurtzer, who just rolled out Singularity version 2.2 (see release details here), says there’s a reason for the fast adoption curve. Scientists had seen what Docker could do for enterprise and were clamoring to gain the same benefits for their workflows on HPC infrastructures. With Singularity, users can take their complete application environment and reproducibly run it anywhere Singularity is installed.

In this Q&A with HPCwire, Kurtzer lays out the use case for Singularity and discusses the evolving HPC usage model that is driving the need for container solutions within the traditional HPC ecosystem.

HPCwire: What does Singularity offer scientific computing users?

Kurtzer: Singularity has three primary goals from a user perspective: reproducibility, mobility and end user freedom.

An area receiving a lot of attention right now in scientific computing is the reproducibility of results relying on computation and data, and this leads to the necessity to come up with mechanisms to ensure complete reproducibility of the entire stack. Just because we can assemble everything from pieces – the source code, the operating system, libraries and so forth – that’s not necessarily a guarantee that we can reproduce the exact same environment that was used to create a particular scientific result.

One of the ideas behind Singularity is that the container itself is represented by a single file image and that single file can be copied. I can share it with you, we can open up the permissions on that file so you and me and that entire group can run it. And we can archive that file and archive it with our data, and you can email it to yourself. All because it’s a single file.

If you look at Docker and other container solutions, it’s not a single file; it’s usually spread out among directories, or layers of archives. So Docker for example will use a whole bunch of small tarballs to recreate an image and that makes it very difficult to archive for scientists.

Singularity is designed around the idea of the extreme simplicity of a single file so we can ensure reproducibility and mobility of compute, which means if it works for me on my laptop running Ubuntu, it will work on a high-performance computing resource running Centos and it will work on another high-performance computing system running Scientific Linux and it’ll just run everywhere including up in the cloud. Again, this is because that file contains the entire encapsulation necessary for the application to run reproducibility — so that satisfies both the reproducibility and the mobility feature requirements.

And the last one is freedom, and that basically means if a developer or a user has built their application or workflow for Ubuntu or Linux Mint or Alpine or whatever their favorite Linux distro is, they can build a container around it and then transfer that file everywhere that Singularity is installed and they will always be able to work inside of that operating system image that they built and control. So it gives a lot of freedom to the end users and the developers, and if you couple that with the idea of mobility, people are now starting to do releases of software built around the idea that their software exists inside of that Singularity image and they can now just distribute that Singularity image.

This means you don’t have to distribute 14 different things that everyone has to assemble by hand. In order to replicate a particular workflow, they can say, this container does all of the bioscience research that this particular set of experiments needs. So you just download that container and you’ve got it; you don’t have to install anything onto the host operating system because it’s all inside that container.

HPCwire: Can you explain in more detail why operating system flexibility is important?

Kurtzer: As a resource provider, we’ll typically buy our hardware or we’ll get an integrated system and it will come with an operating system installed and we’ll spend a bunch of time tuning it, and building software applications for it and polishing it up for the users, and making it into something that is not only tuned, works properly for the hardware, is optimized for our use cases, got our applications, but we also have to support it. And so we spend a lot of effort building the host operating system on any HPC resource. But then there will always be those users where that particular choice of operating systems will not meet their needs. Or they will need different core library versions, that are not simply upgradable. No matter what we do, it’s not going to make everybody happy — not to be negative, but it’s just the reality.

Singularity gives users that freedom that they want to bring their own environment. Now how important is that? It really depends on the applications. Let’s say somebody wants to use TensorFlow or the various bio related Perl or Python stacks (notoriously hard to install). Is it impossible to build it on a Red Hat derivative of an operating system, like Centos or Scientific Linux? No it’s not impossible, but it’s a huge amount of work, especially when you consider that various developers are releasing these stacks ready to go for Debian and for Ubuntu. It makes the justification of not using Debian and Ubuntu very difficult.

HPCwire: Are there implications for versioning as well?

Kurtzer: Definitely with libraries. A lot of software vendors will release binary versions only and many of these binary versions will be linked against a particular library or library version that is not available on a particular operating system. So for example, if you’re linking against something that is easy to upgrade, not part of the core operating system, in many cases, you can do that using a variable called the LD_LIBRARY_PATH and you can in fact run it against different libraries, but if you’re linking against something that is part of the core library like the C library, well that you can’t just upgrade; it’s pretty much impossible, because the moment you try to upgrade it you run the risk of breaking the entire operating system because the entire operating system is based on that exact version of the C library.

So core library and package versioning is a big deal with containers and it does free you from that. So for example, if I’ve got an application that requires a certain version of the C library that my host doesn’t provide you can get around that using a container because everything in the container is completely independent from the C library.

HPCwire: Can you trace the path from the rise of containers, notably Docker, to the creation of Singularity?

singularity-architecture_g-kurtzer
       Source: Gregory Kurtzer, LBNL

Kurtzer: The commonly understood notion of containers stems from the use cases of virtual machines. This means that containers not only share many of the same features, but also similar use cases, specifically service-level virtualization. Generally containers are a more optimal manner to achieve service-level virtualization because of the proximity of the applications to the physical hardware; with full hardware virtualization, there are not only layers of emulation, but also redundancy which affect performance.

Docker is one of the most commonly known container solutions because it is more than just a run-time environment. It’s also a build environment for containers. It’s a sharing platform. It’s an entire platform enabled to support containers and they nailed it for the enterprise use case. But scientists are a resourceful bunch and they saw some of the features like the reproducibility aspects of the Docker platform and the fact that they can leverage each other’s’ work and they jumped on board. There are a lot of scientists that have pushed a huge amount of work into Docker containers and those same scientists then turned to the HPC centers wanting to be able to use those containers on an HPC system, and the HPC service providers were turning back to them and saying, “we can’t integrate Docker, it just doesn’t fit!” There’s not one HPC center that has a full integration of Docker on a traditional HPC environment. There are custom-built Docker solutions that exist but you’re not using traditional HPC that we’ve all come to know and love and spend gobs of money on, so they’re not using high-performance interconnects, they’re not running MPI-like jobs, they’re not using the high performance or inter-galactic parallel file systems, etc. Essentially they have removed the High Performance out of HPC and thus leaving only compute.

That’s where Singularity came to be. We need to satisfy the requirements of the scientists. We need be able to leverage the work that they’ve already done, that’s already in Docker. Singularity has the ability to run a container right out of the Docker registry and/or push that over to an HPC resource and run it over there as well.

HPCwire: What kind of traction and adoption have you had?

Kurtzer: I did the first release (1.0) in April 2016, and I just released version 2.2. Now I can’t tell you what sites are running it and what sites are not because most people don’t contact me to tell me about it, they’re just running the code. But I can tell you out of the ones who have contacted me and spoke with me, I know right now TACC is in the process of integrating it, SDSC has already integrated it on 65,000 cores, University of Florida on 51,000 cores, NIH on 50,000 cores. Our site has 38,000 cores, which includes our institutional Lawrencium cluster. University of Nebraska is also at about 14,000 cores.

Additionally there are other sites also running Singularity: UC Berkeley, UC Irvine, Notre Dame, Stanford, Cambridge also have it installed and there’s even more; I just don’t know about them all.

One of the major factors in the massive uptake has to do with the simplicity of installing and configuring Singularity. Singularity is a single package installation (RPM or DEB) and once it is installed, users can immediately begin utilizing Singularity containers interactively, in their workflows and/or via the resource manager. This has huge benefits for HPC service providers in that there is no need to modify the system architecture, scheduler configuration, or security paradigm in order to utilize Singularity on traditional existing HPC resources.

HPCwire: When you look at the strata of HPC workloads and users, where is the biggest need for Singularity?

Kurtzer: The type of science that we’re seeing is not as much as the typical HPC jobs; but rather the areas of research that are just starting to utilize HPC (for example humanities, social science, public policy, business, political science, etc.). Sometimes called the long tail of science, there are potentially huge areas of research that can make use of HPC. I personally have seen a huge amount of uptake from the long tail of science – the scientists that are not writing compiled code, the scientists that are not as worried about massive scalability. We’re also seeing a lot of uptake along the lines of non-compiled code, which could run at scale. We have some pyMPI jobs that could run across thousands of nodes and tens of thousands of cores and doesn’t necessarily run very efficiently but the cost of human development time is a lot more than paying for CPU time, and non-compiled, higher level languages (such as Python) facilitate this.

HPCwire: How is Singularity different from Shifter, the container solution developed at NERSC?

Kurtzer: While there is some overlap in terms of the container runtime implementations (e.g. both projects use namespaces, maintain user credentials, and support HPC), the primary distinctions are within the fundamental design goals and thus implementation features.

For example, Shifter is designed around the idea of implementing Docker on HPC and the best way to implement that is really not to use Docker at all (as I mentioned, Docker is designed for service level virtualization, not HPC). Considering users have been asking for Docker support on HPC for years, NERSC satisfied those requests by implementing Shifter which takes Docker images and shifts them to a format and mode of operation which can be utilized on large HPC resources. Shifter also integrates with the resource manager to properly execute jobs natively within a specified container.

Singularity, on the other hand, focuses on the mobility and reproducibility of the images themselves and uses these images as the vector of portability. Additionally, Singularity does no integration with the resource manager and relies on the users to manage their container based workflows completely.

One place where we are planning on joining forces is for Shifter to also accept Singularity images (in addition to Docker containers).

More information about Singularity.

More information about Shifter.

Subscribe to HPCwire's Weekly Update!

Be the most informed person in the room! Stay ahead of the tech trends with industy updates delivered to you every week!

A Beginner’s Guide to the ASC19 Finals

April 22, 2019

Three thousand watts. That's how much power the competitors in the 2019 ASC Student Supercomputer Challenge here in Dalian, China, have to work with. Everybody would like more juice to run compute-intensive HPC simulatio Read more…

By Alex Woodie

Is Data Science the Fourth Pillar of the Scientific Method?

April 18, 2019

Nvidia CEO Jensen Huang revived a decade-old debate last month when he said that modern data science (AI plus HPC) has become the fourth pillar of the scientific method. While some disagree with the notion that statistic Read more…

By Alex Woodie

At ASF 2019: The Virtuous Circle of Big Data, AI and HPC

April 18, 2019

We've entered a new phase in IT -- in the world, really -- where the combination of big data, artificial intelligence, and high performance computing is pushing the bounds of what's possible in business and science, in w Read more…

By Alex Woodie with Doug Black and Tiffany Trader

HPE Extreme Performance Solutions

HPE and Intel® Omni-Path Architecture: How to Power a Cloud

Learn how HPE and Intel® Omni-Path Architecture provide critical infrastructure for leading Nordic HPC provider’s HPCFLOW cloud service.

powercloud_blog.jpgFor decades, HPE has been at the forefront of high-performance computing, and we’ve powered some of the fastest and most robust supercomputers in the world. Read more…

IBM Accelerated Insights

Bridging HPC and Cloud Native Development with Kubernetes

The HPC community has historically developed its own specialized software stack including schedulers, filesystems, developer tools, container technologies tuned for performance and large-scale on-premises deployments. Read more…

Google Open Sources TensorFlow Version of MorphNet DL Tool

April 18, 2019

Designing optimum deep neural networks remains a non-trivial exercise. “Given the large search space of possible architectures, designing a network from scratch for your specific application can be prohibitively expens Read more…

By John Russell

A Beginner’s Guide to the ASC19 Finals

April 22, 2019

Three thousand watts. That's how much power the competitors in the 2019 ASC Student Supercomputer Challenge here in Dalian, China, have to work with. Everybody Read more…

By Alex Woodie

At ASF 2019: The Virtuous Circle of Big Data, AI and HPC

April 18, 2019

We've entered a new phase in IT -- in the world, really -- where the combination of big data, artificial intelligence, and high performance computing is pushing Read more…

By Alex Woodie with Doug Black and Tiffany Trader

Interview with 2019 Person to Watch Michela Taufer

April 18, 2019

Today, as part of our ongoing HPCwire People to Watch focus series, we are highlighting our interview with 2019 Person to Watch Michela Taufer. Michela -- the Read more…

By HPCwire Editorial Team

Intel Gold U-Series SKUs Reveal Single Socket Intentions

April 18, 2019

Intel plans to jump into the single socket market with a portion of its just announced Cascade Lake microprocessor line according to one media report. This isn Read more…

By John Russell

BSC Researchers Shrink Floating Point Formats to Accelerate Deep Neural Network Training

April 15, 2019

Sometimes calculating solutions as precisely as a computer can wastes more CPU resources than is necessary. A case in point is with deep learning. In early stag Read more…

By Ken Strandberg

Intel Extends FPGA Ecosystem with 10nm Agilex

April 11, 2019

The insatiable appetite for higher throughput and lower latency – particularly where edge analytics and AI, network functions, or for a range of datacenter ac Read more…

By Doug Black

Nvidia Doubles Down on Medical AI

April 9, 2019

Nvidia is collaborating with medical groups to push GPU-powered AI tools into clinical settings, including radiology and drug discovery. The GPU leader said Monday it will collaborate with the American College of Radiology (ACR) to provide clinicians with its Clara AI tool kit. The partnership would allow radiologists to leverage AI techniques for diagnostic imaging using their own clinical data. Read more…

By George Leopold

Digging into MLPerf Benchmark Suite to Inform AI Infrastructure Decisions

April 9, 2019

With machine learning and deep learning storming into the datacenter, the new challenge is optimizing infrastructure choices to support diverse ML and DL workfl Read more…

By John Russell

The Case Against ‘The Case Against Quantum Computing’

January 9, 2019

It’s not easy to be a physicist. Richard Feynman (basically the Jimi Hendrix of physicists) once said: “The first principle is that you must not fool yourse Read more…

By Ben Criger

Why Nvidia Bought Mellanox: ‘Future Datacenters Will Be…Like High Performance Computers’

March 14, 2019

“Future datacenters of all kinds will be built like high performance computers,” said Nvidia CEO Jensen Huang during a phone briefing on Monday after Nvidia revealed scooping up the high performance networking company Mellanox for $6.9 billion. Read more…

By Tiffany Trader

ClusterVision in Bankruptcy, Fate Uncertain

February 13, 2019

ClusterVision, European HPC specialists that have built and installed over 20 Top500-ranked systems in their nearly 17-year history, appear to be in the midst o Read more…

By Tiffany Trader

Intel Reportedly in $6B Bid for Mellanox

January 30, 2019

The latest rumors and reports around an acquisition of Mellanox focus on Intel, which has reportedly offered a $6 billion bid for the high performance interconn Read more…

By Doug Black

It’s Official: Aurora on Track to Be First US Exascale Computer in 2021

March 18, 2019

The U.S. Department of Energy along with Intel and Cray confirmed today that an Intel/Cray supercomputer, "Aurora," capable of sustained performance of one exaf Read more…

By Tiffany Trader

Looking for Light Reading? NSF-backed ‘Comic Books’ Tackle Quantum Computing

January 28, 2019

Still baffled by quantum computing? How about turning to comic books (graphic novels for the well-read among you) for some clarity and a little humor on QC. The Read more…

By John Russell

IBM Quantum Update: Q System One Launch, New Collaborators, and QC Center Plans

January 10, 2019

IBM made three significant quantum computing announcements at CES this week. One was introduction of IBM Q System One; it’s really the integration of IBM’s Read more…

By John Russell

Deep500: ETH Researchers Introduce New Deep Learning Benchmark for HPC

February 5, 2019

ETH researchers have developed a new deep learning benchmarking environment – Deep500 – they say is “the first distributed and reproducible benchmarking s Read more…

By John Russell

Leading Solution Providers

SC 18 Virtual Booth Video Tour

Advania @ SC18 AMD @ SC18
ASRock Rack @ SC18
DDN Storage @ SC18
HPE @ SC18
IBM @ SC18
Lenovo @ SC18 Mellanox Technologies @ SC18
NVIDIA @ SC18
One Stop Systems @ SC18
Oracle @ SC18 Panasas @ SC18
Supermicro @ SC18 SUSE @ SC18 TYAN @ SC18
Verne Global @ SC18

IBM Bets $2B Seeking 1000X AI Hardware Performance Boost

February 7, 2019

For now, AI systems are mostly machine learning-based and “narrow” – powerful as they are by today's standards, they're limited to performing a few, narro Read more…

By Doug Black

The Deep500 – Researchers Tackle an HPC Benchmark for Deep Learning

January 7, 2019

How do you know if an HPC system, particularly a larger-scale system, is well-suited for deep learning workloads? Today, that’s not an easy question to answer Read more…

By John Russell

Arm Unveils Neoverse N1 Platform with up to 128-Cores

February 20, 2019

Following on its Neoverse roadmap announcement last October, Arm today revealed its next-gen Neoverse microarchitecture with compute and throughput-optimized si Read more…

By Tiffany Trader

Intel Launches Cascade Lake Xeons with Up to 56 Cores

April 2, 2019

At Intel's Data-Centric Innovation Day in San Francisco (April 2), the company unveiled its second-generation Xeon Scalable (Cascade Lake) family and debuted it Read more…

By Tiffany Trader

France to Deploy AI-Focused Supercomputer: Jean Zay

January 22, 2019

HPE announced today that it won the contract to build a supercomputer that will drive France’s AI and HPC efforts. The computer will be part of GENCI, the Fre Read more…

By Tiffany Trader

Oil and Gas Supercloud Clears Out Remaining Knights Landing Inventory: All 38,000 Wafers

March 13, 2019

The McCloud HPC service being built by Australia’s DownUnder GeoSolutions (DUG) outside Houston is set to become the largest oil and gas cloud in the world th Read more…

By Tiffany Trader

Intel Extends FPGA Ecosystem with 10nm Agilex

April 11, 2019

The insatiable appetite for higher throughput and lower latency – particularly where edge analytics and AI, network functions, or for a range of datacenter ac Read more…

By Doug Black

UC Berkeley Paper Heralds Rise of Serverless Computing in the Cloud – Do You Agree?

February 13, 2019

Almost exactly ten years to the day from publishing of their widely-read, seminal paper on cloud computing, UC Berkeley researchers have issued another ambitious examination of cloud computing - Cloud Programming Simplified: A Berkeley View on Serverless Computing. The new work heralds the rise of ‘serverless computing’ as the next dominant phase of cloud computing. Read more…

By John Russell

  • arrow
  • Click Here for More Headlines
  • arrow
Do NOT follow this link or you will be banned from the site!
Share This