SC20 Panel – OK, You Hate Storage Tiering. What’s Next Then?

By John Russell

November 25, 2020

Tiering in HPC storage has a bad rep. No one likes it. It complicates things and slows I/O. At least one storage technology newcomer – VAST Data – advocates dumping the whole idea. One large-scale user, NERSC storage architect Glenn Lockwood, sort of agrees. The challenge, of course, is that tiering is a practical necessity for the vast majority of HPC systems. Faster, cheaper, and more flexible as the new solid state hardware choices are, they are not yet cheap enough. And there’s a huge base of existing storage infrastructure (software and hardware) that’s not going away quickly.

An SC20 panel held last week – Diverse Approaches to Tiering HPC Storage – dug into current tiering trends. Although it produced little agreement on when or precisely how storage tiering will enter its own end-of-life phase, the conversation covered a lot of ground. This was a large group of prominent panelists: Glenn Lockwood, storage architect, NERSC/LBNL; Wayne Sawdon, CTO and strategy architect, IBM (Spectrum Scale); Jeff Denworth, VP of products and marketing, and co-founder, VAST Data; Liran Zvibel, CEO and founder, WekaIO; Curtis Anderson, senior software architect, Panasas; Matthew Starr, CTO, Spectra Logic; and Andreas Dilger, Lustre principal architect, Whamcloud (owned by DDN). The moderator was Addison Snell, CEO and founder, Intersect360 Research.

Lockwood’s early comments set the stage well and describe, at least in an aspirational sense, how many in HPC view storage tiering. He’s on the team standing up NERSC’s next supercomputer, Perlmutter. Here’s a somewhat extended (and lightly-edited) excerpt from his opening comments:

“We have a very large and very broad user base comprised of thousands of active users, hundreds of projects and hundreds of different applications every year. We also support not only traditional simulation workloads, but the large-scale data analytics workflows coming from experimental facilities such as telescopes and beam lines, and because of this breadth, our aggregate workload is very data intensive.

“Our storage systems typically see over an exabyte of I/O annually. Balancing this I/O intensive workload with the economics of storage means that at NERSC, we live and breathe tiering. And this is a snapshot of the storage hierarchy we have on the floor today at NERSC. Although it makes for a pretty picture, we don’t have storage tiering because we want to, and in fact, I’d go so far as to say it’s the opposite of what we and our users really want. Moving data between tiers has nothing to do with scientific discovery.

“To put some numbers behind this, last year we did a study that found that between 15% and 30% of that exabyte of I/O is not coming from our users’ jobs, but instead coming from data movement between storage tiers. That is to say that 15% to 30% of the I/O at NERSC is a complete waste of time in terms of advancing science. But even before that study, we knew that both the changing landscape of storage technology and the emerging large-scale data analysis and AI workloads arriving at NERSC required us to completely rethink our approach to tiered storage.

“Back in 2015, we began devising a strategic plan [for storage]. Our goal is ultimately to have as few tiers as possible for our users’ sake, while balancing the economic realities of storage media as they evolve over the next decade. Fortunately, economic trends are on our side for storage. And in anticipation of falling flash prices, we gave ourselves a goal of collapsing our performance tiers by 2020 to coincide with our 2020 HPC procurement, which is now called Perlmutter.

“I’m happy to report that we are on track with our 2020 goal. We’re currently in the process of deploying the world’s first all-NVMe parallel file system at scale that will replace both our previous burst buffer tier and disk-based scratch tier. Once the dust is settled with Perlmutter, we’re aiming to further reduce down to just two tiers that map directly to the two fundamental categories of data that users store at NERSC: one, a hot tier for data that users are actively using to carry out their work; and two, a cold tier for data that users don’t actively need but are storing just in case they or someone else needs it somewhere down the road.

“By only having these two tiers, the only data movement that users need to worry about is moving data before and after their entire scientific project is carried out. At NERSC, this means that they will only have to worry about this once or twice per year. We will still keep two separate tiers because they represent two fundamentally different ways users interact with their data.

“The hot tier will be optimized for performance and getting data to and from user applications as fast as possible, so that they can spend their time advancing science rather than waiting for I/O. And their cold tier will be optimized for making data easy to search, index and share with their collaborators reliably. Because we don’t expect any magical media to hit the market between now and 2025, we’re relying on software to bridge the gap between different media types, so that even if there are both, say, flash and non-volatile memory in our hot tier, users don’t have to worry about which media is backing their data within that tier.”
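
To put Lockwood’s 15% to 30% figure in absolute terms, here is a quick back-of-the-envelope calculation. It assumes the “over an exabyte” of annual I/O is rounded down to exactly one exabyte in decimal units, so the result is a floor on the estimate rather than a measured value; the function name is illustrative only.

```python
# Back-of-the-envelope arithmetic for the inter-tier traffic quoted above.
# Assumption: "over an exabyte" is rounded down to exactly 1 EB (decimal units).
PB_PER_EB = 1000  # 1 exabyte = 1000 petabytes in decimal (SI) units


def tier_movement_pb(total_io_eb: float, fraction: float) -> float:
    """Return the petabytes of annual I/O attributable to inter-tier data movement."""
    return total_io_eb * PB_PER_EB * fraction


low = tier_movement_pb(1.0, 0.15)
high = tier_movement_pb(1.0, 0.30)
print(f"Inter-tier movement: roughly {low:.0f} to {high:.0f} PB per year")
```

Under that assumption, somewhere between 150 and 300 petabytes of I/O per year would be spent shuffling data between tiers rather than serving user jobs.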

Keep in mind Perlmutter is a government-funded supercomputer supported by technical and financial resources that are beyond the scope of most HPCers. That said, Lockwood’s description of the problem and NERSC’s target solution echo the perspectives of many in the HPC user community.

Snell pointed out in shaping the discussion, “The implementation of solid state or flash storage has continued to grow among HPC users, and in most cases it exists together with, not separate from, conventional spinning disk hard drives. Having data on the right tier at the right time has become a compelling conversation point in determining what constitutes high performance storage, together with the potential latency in moving data between tiers. Meanwhile, the need for long-term data stewardship is no less important. And to make things even more complicated, cloud computing is increasingly common in high performance computing, bringing in the notion of cloud as yet another data location.”

Because the panel was quite long (1.5 hours), presented here are just a few comments from each of the panelists on how their companies implement tiering.

PANASAS

Panasas’s view is that conventional tiering, when all things are considered, just hurts price performance. “You’ve got your compute cluster. You have a hot tier, that’s probably made of NVMe flash or something expensive. You have a cold tier that’s made out of hard drives and lower cost technologies. Then you have the storage management software layer that’s moving data back and forth. What do you get out of that? You get unpredictable performance. If you have guessed right, or your data has been recently used, then it’s in the hot tier [and] performance is not bad. If you guessed wrong or the workloads have changed or the compute cluster is overloaded, the data is still in the cold tier, and your nodes are idle while you’re pulling data up to the hot tier,” said Anderson.

“In addition, you get three separate management domains. There’s probably three separate products, they all need management. This is an example of temperature based data placement, how recently something has been accessed, determines where it lives, unless you’re reaching in and manually managing where data lives which has its own set of costs and issues,” he said.

“We believe that data placement based upon size, not on temperature, is the right solution. So we’ve built the ActiveStor Ultra around this concept. Hard drives, just as an example, are not terrible products from an HPC storage perspective, they’re great products for HPC. They deliver a tremendous amount of bandwidth, as long as you only give them large files and large sequential transfers to operate on. They’re very inexpensive per terabyte of storage, and they deliver great performance. So this [ActiveStor Ultra] is an architecture that’s based on device efficiency, getting the most out of the fundamental building blocks of SSDs and HDDs that are part of the solution.”

Panasas is getting a 2x performance advantage over competitive products, according to Anderson, although he didn’t specify which products. Using what he called a single tier approach, where data placement is based on size using a mix of technologies, is the key. “Comparing HPC storage products is difficult, because there’s such wildly different hardware. But if you boil it down to gigabytes delivered per 100 spindles of hard drive, then you get a comparison we believe is fair. The reason we’re getting this benefit is the architectural difference,” he said.
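
The distinction Anderson draws between size-based and temperature-based placement can be sketched in a few lines. This is a minimal illustration, not Panasas’s implementation: the 1 MiB cutoff, the seven-day hot window, the file names and the function names are all hypothetical, since the panel did not give specifics.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the article does not say where Panasas draws the line.
SMALL_FILE_THRESHOLD_BYTES = 1 * 1024 * 1024


@dataclass
class FileInfo:
    path: str
    size_bytes: int
    seconds_since_access: float


def place_by_size(f: FileInfo, threshold: int = SMALL_FILE_THRESHOLD_BYTES) -> str:
    """Size-based placement: small files go to flash, large files to hard drives."""
    return "ssd" if f.size_bytes < threshold else "hdd"


def place_by_temperature(f: FileInfo, hot_window_s: float = 7 * 24 * 3600) -> str:
    """Temperature-based placement, for contrast: recently used data goes to the hot tier."""
    return "hot" if f.seconds_since_access < hot_window_s else "cold"


files = [
    FileInfo("run42/params.json", 4_096, 90 * 24 * 3600),  # small, untouched for 90 days
    FileInfo("run42/checkpoint.h5", 50 * 1024**3, 60.0),   # large, used a minute ago
]
for f in files:
    print(f.path, place_by_size(f), place_by_temperature(f))
```

The point of the contrast is that size-based placement is decided once from a property of the file, so nothing has to migrate later, whereas the temperature-based answer changes as access times age and therefore implies ongoing data movement.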

VAST DATA

VAST Data has vast aspirations and Denworth’s enthusiastic pitch matched those aspirations. Time will tell how this young company fares. About tiering he said, “Basically just throw the whole concept right out the window. We don’t see a lot of utility in the topic in 2020.” The key ingredients in VAST’s formula for dispensing with tiering include QLC flash memory, 3D XPoint, NVMe fabric, fresh coding, and buying scale.

“What we’ve done is we’ve combined a next generation computing layer that’s built in stateless Docker containers over a next generation NVMe fabric which can be implemented over Ethernet or InfiniBand, essentially, to disaggregate the containers from the underlying storage media, and make it such that they share all of the media, all of the NVMe devices within the cluster,” said Denworth. “That allows us to basically implement a new class of global codes, global codes around how we get to really, really efficient data protection. Imagine two and a half percent overhead for RAID at scale without compromising on resilience, and global data reduction codes, such that you can get to now dramatic efficiency gains from where customers have come from in the past. And finally, global flash translation codes that get up to 10 years of use out of low-cost QLC flash that other storage systems can’t even use in their architectures.”

“When you put this together, you now have this stateless containerized architecture that can scale to exabytes, doesn’t cost any more for your infrastructure than what you paid for hard drive-based infrastructure, and the cost of flash is declining at a rate that’s much more aggressive than hard drives, such that we feel that we’re on the right side of that curve,” he said.
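
Denworth’s “two and a half percent overhead” figure implies very wide stripes. The sketch below works out how wide, under two stated assumptions: overhead is taken to mean the share of raw capacity consumed by parity, and the parity counts tried are hypothetical, since the panel did not describe VAST’s actual stripe geometry.

```python
import math
from fractions import Fraction


def parity_overhead(data_strips: int, parity_strips: int) -> Fraction:
    """Fraction of raw capacity consumed by parity in a data+parity stripe."""
    return Fraction(parity_strips, data_strips + parity_strips)


def min_data_strips(parity_strips: int, max_overhead: Fraction) -> int:
    """Smallest number of data strips that keeps parity overhead at or below max_overhead."""
    # parity / (data + parity) <= o   =>   data >= parity * (1 - o) / o
    return math.ceil(parity_strips * (1 - max_overhead) / max_overhead)


target = Fraction(25, 1000)  # 2.5%, the figure quoted on the panel

# Conventional 8+2 RAID 6, for comparison.
print("8+2 overhead:", float(parity_overhead(8, 2)))

# Hypothetical parity counts; not a description of any vendor's layout.
for parity in (2, 4):
    data = min_data_strips(parity, target)
    print(f"{parity} parity strips need at least {data} data strips")
```

With exact rational arithmetic the bound is easy to check by hand: two parity strips need at least 78 data strips, and four need at least 156, compared with a 20% overhead for a conventional 8+2 layout. Stripes that wide are only practical when every node can reach every device, which is the shared-everything arrangement Denworth describes.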

Denworth noted the rise of AI workloads as an important factor driving change, “AI changes the I/O paradigm altogether and where HPC systems of yesterday were designed to optimize for writes and then burst buffers came in and further optimized for writes. On the flip side, these new algorithms want to randomly read through the largest amounts of data at the fastest rates of speed, and you just can’t prefetch a random read. And we’re not alone on the [flash] island. Organizations like the DOE Office of Science have concluded that the only way to efficiently feed next generation AI algorithms is to use random access flash.”

WEKAIO

WekaIO’s take on tiering was interesting. “It’s a central part of what we do,” said Zvibel. He describes WekaIO as an enterprise-grade storage file system for customers that need the NAS feature set but also need the performance of a parallel file system. “If you go with NAS, you get feature rich enterprise grade, but you’re usually limited by your scale and performance. On the other hand, if you’re going with a parallel file system, you’re going to get a lot of scale, great throughput [but] be limited on mixed workloads and low latency I/Os.”

The WekaIO file system, says the company, has been built for use with flash and optimized for NVMe and cloud performance and agnostic about whose hardware is used (see architecture slide).

“You buy your commodity servers from vendors you like, put the Weka software on, you’re getting a parallel file system for NVMe over fabric. Then you install our clients on your compute and the I/O runs parallel to all of the NVMe-over-fabric servers and [provides] the extra performance. We also support GPU direct storage for the Nvidia GPUs, NFS, SMB and S3,” said Zvibel.

Regarding tiering, Zvibel said, “A lot of the projects have a portion of the data that is their active window. This could be hundreds of terabytes. It could be a few petabytes or dozens of petabytes, but usually they have a lot more capacity that is stored, and they don’t need to access it for every run. For that we’re enabling tiering to any S3 enabled object storage and we can tier it to more than one, on-prem [or] in the cloud. You can have your data stored in separate locations. We’re not just tiering, we’re actually enabling a feature we call snap-to-object. So if you’re tiering to two locations, you can save snapshots and have the Weka system spun up on the other side, and basically keep running from that point in time.”

SPECTRA LOGIC

Starr, from long-time tape powerhouse Spectra Logic, briefly reviewed the storage technology pyramid and noted its relative inefficiencies. He singled out the weak metadata handling capabilities of file systems as an ongoing issue. He also singled out 3D XPoint’s load-store functionality as a potential game-changer “for what applications can do for doing things like snapshots; instead of writing to a disk or an NVMe system to a disk drive interface, or file system interface, [they can use] XPoint to do load-store and actually [get] persistent storage.”

On balance, Starr sees the emergence of dynamic tiering across hybrid technologies as the wave of the future. “I think the model is going to [be one] where you end up with CPU RAM, XPoint, NVMe and compute together. That system will be the scratch file system, the sandbox for people to play in. Then [you’ll have] a separate storage area, made of hybrid [technologies] of the flash arrays, HDD, tape, cloud, and most likely that’s going to have an object RESTful interface.” He thinks that latter interface is likely to “be an immutable interface so that as data’s coming into this scratch area, and being written back out, new versions are being written back out on the right side, when they need to be written out.”

Overall, Starr suggested the following trends: “First, we’re seeing a move off HSMs (hierarchical storage management systems) – not saying those systems are going away tomorrow, but the idea that HSMs are going to be replaced with object storage interfaces. I think the new storage tiers like XPoint are going to start changing how applications are written, especially when you think about snapshots, or VM farms, how much RAM you can actually get into a server with 3D XPoint. Customers are going to look at open standards a lot more, like LTFS (Linear Tape File System). Those are the winners. Just like Linux won the Unix war, LTFS is going to win the tape format standard war.

“I think we are going to see a lot more RESTful object storage interfaces and open standards being deployed where you can deploy content in the cloud [but still keep] a copy of that data on site for easy retrieval, but have a copy in the cloud to share it with other people. [I think we’ll see more] immutable archives. Lastly, I think that we’re going to be looking a lot more [at] search [capability] and how we perform data capture and search before we put data deep into an archive. Those are the things I think are trending in the storage areas today.”

IBM

Spectrum Scale, formerly GPFS, has a long history of storage tiering support. “By 2000, two years after we introduced the product, we supported it for tape using the XDSM DMAPI standard. In 2006, we introduced Information Lifecycle Management, which divides the online storage into pools, allowing us to distinguish fast storage from slower storage. Along with the pools we introduced policy rules. In 2010, we introduced Active File Management, which allows us to tier data across the wide area network to other sites in other locations,” said Sawdon.

“In 2016, we introduced transparent cloud tiering to allow us to move data to and from cloud storage. Spectrum Scale continues to invest in data tiering to extend its common namespace across data lakes and object storage. We are also investing to transparently tier data into and out of client compute nodes using storage rich servers with local NVMe and persistent memory.”

Unlike Denworth, IBM sees storage tiering as something that will continue to be important. The goal, reiterated Sawdon, is to move the data as close to the computation as possible to reduce the time required to derive value from the data. IBM believes Spectrum Scale will keep evolving to meet the task of serving different media (and functional) cost-performance requirements.

“With today’s analytics on data, we see an increase in both the volume of the data and then the value of the data. High volume increases the demand for tiering to cheaper storage; high value increases the demand for tiering to high-performance storage to reduce the time for analytics. Thus, we conclude that data tiering is important now and will be even more important in the future,” said Sawdon.

Sawdon also emphasized Spectrum Scale’s software defined nature. “The benefit of being software defined is we can run on any hardware, we don’t care if we’re running on cloud hardware, we don’t care if it’s on prem. So we do have HPC deployments in the cloud. What we’re seeing for cloud deployments are [that] these are in fact, larger than our on-prem ones. Virtual machines are easy to spin up and getting 100,000 node clusters just happens. So we’re seeing lots of activity in that space. The interesting thing for Spectrum Scale is we’ve built common namespaces across different installations. You can build common namespaces between your cloud deployment and an on-prem deployment, and transparently move data if the customer wants to do it. Today, customers aren’t doing that yet. But we do have customers in the cloud who are running HPC.”

WHAMCLOUD/LUSTRE

Lustre has long been an HPC mainstay. Since DDN acquired Whamcloud (Lustre) from Intel, it has been working to incorporate common features used in enterprise file systems and to add ease of use features. Lustre underpins DDN’s EXAScaler product line (see HPCwire coverage). Dilger provided a brief overview of tiering in Lustre.

“The primary mechanism by which Lustre achieves storage tiering is through the use of storage pools to identify different storage technologies, and then the file layout which is stored on every file individually, and can be specified either as a default for the whole file system, per directory or per file. This allows a great deal of flexibility in terms of where files are located. To avoid applications or users trying to consume all of this storage space in, say, a flash pool on a system there are quotas to prevent abuse of the resources,” said Dilger.

Typically, files are initially placed on a flash pool, then files can be mirrored to a different storage pool. “If the flash storage pool is becoming full, the policy agent can release the copy of older files and leave the one copy on the disk. It’s even possible to split a single file across different storage types. This is practical for very large files that can’t necessarily all fit into a single storage pool, or for files that have different types of data in them, for instance, an index at the start of the file, and then large data stored at the end of the file,” said Dilger.
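
The mirror-then-release behavior Dilger describes can be modeled with a small simulation. This is a toy sketch in plain Python, not Lustre code or its policy engine: the class and function names, the 80% watermark and the capacity units are all hypothetical, chosen only to show the order of operations.

```python
from dataclasses import dataclass, field


@dataclass
class TrackedFile:
    name: str
    size: int          # arbitrary capacity units
    last_access: int   # larger means more recently used
    pools: set = field(default_factory=lambda: {"flash"})  # new files land on flash


def mirror_to_disk(files):
    """Give every file a second copy in the disk pool."""
    for f in files:
        f.pools.add("disk")


def flash_used(files):
    """Total capacity currently occupied in the flash pool."""
    return sum(f.size for f in files if "flash" in f.pools)


def release_from_flash(files, capacity, high_watermark=0.8):
    """Drop flash copies of the oldest mirrored files until usage is at or below the watermark."""
    released = []
    for f in sorted(files, key=lambda f: f.last_access):
        if flash_used(files) <= capacity * high_watermark:
            break
        if "flash" in f.pools and "disk" in f.pools:  # never drop the only copy
            f.pools.discard("flash")
            released.append(f.name)
    return released


files = [TrackedFile("a.dat", 40, 1), TrackedFile("b.dat", 30, 2), TrackedFile("c.dat", 20, 3)]
mirror_to_disk(files)
released = release_from_flash(files, capacity=100)
print("released from flash:", released)
print("flash in use:", flash_used(files))
```

In this run the flash pool starts at 90 of 100 units, above the hypothetical 80% watermark, so the least recently used file loses its flash copy and stays readable from disk. The guard that checks for a disk mirror before releasing is the essential safety property: a file is only ever dropped from flash when another copy already exists.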

In addition to managing storage on the server, Lustre can now also manage storage directly in the client. “This is through the persistent client cache. This allows files to be stored on local storage that are very large or need very low latency access. This leverages a local file system such as ext4, or NOVA. There are two types of storage for persistent client cache, either a read-only mirror that’s shared among the file system and multiple clients or an exclusive copy where the client can write locally and then the data is mirrored asynchronously back to the Lustre file system,” he said.

Discussion around how quickly NVMe, flash and 3D XPoint technology would overturn the current tiering paradigm was lively but ultimately inconclusive. Best to watch the SC20 video for that interchange. Broadly, in the near-term, most panelists think dynamically tiering across media types will collapse tiers, at least in the sense that the tiers are increasingly invisible to users and applications. Also, the persistence of POSIX, installed infrastructure (HDD et al.) and improving tape performance suggest some form of discrete storage tiering will remain for quite some time.

There was a fair amount of discussion around cloud storage economics and interestingly also around AI application I/O patterns. The thinking on AI I/O patterns was that they were largely developed on smaller devices (laptops) without much thought about large-scale systems or storage systems behavior – that may change.

On balance, the panelists agree NVMe flash and 3D XPoint are the future; it’s a matter of when, and most of the panelists expect an evolution rather than an abrupt replacement of existing storage tech.

Lockwood noted, “Ideally, economics is what allows us to keep collapsing tiers and not the other way around. So the project costs, how much we spend on storage, has no change. For Perlmutter, that is approximately true. Our previous system, Cori, I want to say the storage budget was somewhere between 10% and 15% of the total system cost. For Perlmutter, it was about the same. So we’re just leveraging the falling economics of flash. Where additional money probably will be needed is in the software and software enablement part of this, and whether or not you consider that part of the capital expenditure or non-recurring engineering or internal R&D effort is a much more complicated question.”
