Part Two: Navigating Life Sciences Choppy HPC Waters in 2018

By John Russell

March 8, 2018

2017 was not necessarily the best year to build a large HPC system for life sciences, say Ari Berman, VP and GM of consulting services, and Aaron Gardner, director of technology, at research computing consultancy BioTeam. Perhaps that’s true more generally as well. The reason is that enough new technology options were entering the market or expected soon – think AMD’s EPYC processor line, Intel’s Skylake, and IBM’s Power9 chip – that choosing among them could seem premature. The jolt of Spectre-Meltdown in early 2018 hasn’t helped settle the waters.

In part one of HPCwire’s examination of 2018 HPC trends in life sciences, published last week, Berman and Gardner talked about AI trends and cloud use in life sciences. In Part Two, presented here, they consider the prospect of real challenge to Intel’s dominance in the processor landscape, the proliferation of storage technology options, and the rising need for fast networking in life sciences.

HPCwire: The processor market hasn’t been this frothy for a long time with AMD showing real traction and IBM hoping for the same. Are we looking at real change?

Ari Berman, BioTeam

Ari Berman: I agree that in 2017 and 2018 diversity is the name of the game. For a long time, Intel was the only game in town with the Xeons, and then they branched out with Xeon Phi and co-processing. I think the really interesting issue now is that Intel took a gamble by going more to a platform model for the CPU in Skylake, and in some ways that’s paid off, and in some ways it hasn’t. I think that particularly in the HPC space, independent of Spectre-Meltdown, there were some performance problems with the Skylake architecture. That’s pretty common in version one of any new product. But it took some of the major HPC centers to figure out that there were problems the chip designer didn’t anticipate. Intel is sort of running to catch up in a lot of ways, down to microcode, which is really hard to catch up on. At the same time, AMD has surged out and EPYC has gained some footing in this market too.

It’s a really interesting time for core processing, and at the same time Power9 has come out. No one knows what kind of an impact that is going to have. IBM is definitely making a power play in this space, and suddenly the dome has crumbled a little bit on the Intel monopoly.

Aaron Gardner, BioTeam

Aaron Gardner: We see a lot of good things happening on the horizon. When the Ryzen CPUs came out I bought one immediately just because it was something new. Early testing is looking really good with the EPYC family on the server side, based on Naples (architecture). I think we are even more interested in the forthcoming Rome architecture and what we’ll see come out of that. I think the industry as a whole in the last year has been cautiously optimistic about what it means to have AMD involved again. I think we are going to see a lot of rejuvenation in the CPU space.

Our advice to folks is this is not a time to just use what you have always used; instead, look at the playing field and look at all options. We certainly are taking that approach with the organizations we work with. I do think interoperability across architectures is going to become more and more important for many clients. Some people do like to pick a particular architecture and ultra-optimize, but across the landscape we are seeing multiple architectures in play, and that is important.

HPCwire: Betting against Intel is not often a good idea. Are we seeing a real and persistent change in the processor landscape?

Ari Berman: That’s a great question. The safe answer is: it depends. What it depends upon is the ability of both AMD and IBM to deliver. That was the problem in the past and has remained the problem. IBM’s additional problem is cost. It’s always much more expensive to go with an IBM processor than with the other two, and at scale that matters immensely. [For example] the exascale folks are going to work to minimize processor cost and power consumption, and Power9 is not a cheap processor in both power utilization and raw cost. There may be some benefits to it, but we still have not seen any real penetration of the Power platform in life sciences.

However, we are seeing clusters being built with EPYC and the Intel family of processors. It really depends. Arm is also right in there. The problem with Arm is, much like GPUs, a lot of the algorithms have to be refactored and recompiled and tested to see if they work the same, because it is not an x86 architecture. So, is the activation energy required by the scientific use cases worth the effort to convert to an all-Arm system?

Aaron Gardner: And people need to be prepared to move back if necessary. There absolutely has been a transition in the last year, but the staying power of that transition is an important question. That’s why we are looking towards the Rome architecture with AMD. We are seeing some interesting signals from vendors around them diversifying their CPU architecture. How much that leads to further change and lasting change in the industry is going to have to do with the lift that is required to realize that change. From what we are seeing now, the lift is not much on AMD platforms to optimize for life sciences workloads. That’s not to say that Arm doesn’t have a play there, but it is definitely not as quick a lift and not as generalized a solution. IBM has done a lot with Power to make it something that is tenable, but we’ll have to wait and see there.

HPCwire: Are you hearing worries about AMD’s commitment to the market?

Aaron Gardner: Earlier on, we definitely heard that story repeated by OEMs, vendors, and partners, but we are hearing less of that now. Again, we’ll check back in a year. I think those worries would return as a knee-jerk reaction if there are bumps in the road again. I think everyone wants to see a diversified playing field in the CPU space, but people have memories of that previous pull-back from AMD and also of Intel’s ability to engineer its way back into dominance. Both of those stories are deeply ingrained in the industry psyche.

HPCwire: Talk about scarring the psyche. The thunderbolt of Spectre-Meltdown is stirring serious worry in both the user and vendor communities. What’s your take?

Aaron Gardner: It’s true. The Spectre and Meltdown security vulnerabilities are still reverberating and in play. We’ll see how it all plays out. One of the interesting things, going back to cloud computing for a moment before returning to CPUs, is that the cloud providers, due to the nature of those vulnerabilities, had an obligation to mitigate them quickly. Very quickly, mitigation measures were put in place across cloud providers. But there are performance implications with those patches. If you have local infrastructure, you can choose your own approach and stance. But in the cloud, you must accept the providers’ approaches.

We saw messages from the high-performance storage community pointing out that applying the Spectre and Meltdown patches to storage clients had a tremendous impact on storage performance. We also had many letters coming out from the storage vendors speaking to their stances on how to approach the vulnerabilities. The summation there is that having a diversified CPU portfolio, especially if you are on-premises and off-premises as well as multi-cloud, just gives people some hedges and adaptability to navigate the CPU waters. I think Spectre-Meltdown really showed the industry that there can be speed bumps along the way with different CPUs, so being able to move workloads across architectures can become an important consideration.

Ari Berman: I’ll add to that. Just as in a multi-cloud environment, it’s important to understand what different processors and different platforms bring to you. Take the time to understand [your needs]. Does your workload require a lot of PCIe links? Then AMD is the thing for you. If you still need a lot of high-powered integer calculations, maybe Skylake is still the thing for you.

HPCwire: Let’s shift gears and talk about storage.

Ari Berman: BioTeam vacillates on what our biggest issue in life sciences is; it swings between networking and storage/data management. In the last year or so, the problem has shifted back from networking to storage as the major issue. The main issue is there’s a lot of hype and a lot of diversity in the storage market, and people are realizing that the storage market is incredibly overpriced for what it delivers. Also, the available technologies that are coming out – new file systems, the dropping price of flash, and PCIe switching – all have the potential to transform this entire space.

At the same time, the need for multi-petabyte storage by almost everybody has really driven life sciences. We are almost to the point that any laboratory with significant data generation capability needs to be peta-capable, because over the course of a year or two a single device can generate a petabyte of data. That was a major challenge a couple of years ago. Today, having to manage one or ten petabytes is commonplace. But the power costs and the management costs haven’t changed at all at scale. That’s one of the major challenges.

A few years ago, we did see a shift away from scale-out NAS to parallel distributed file systems in life sciences because of the scalability. That [coincided] with speed improvements. That shift continues on some fronts, but there’s a bit of disillusionment about what those parallel distributed file systems’ capabilities really are, and vendors are actually being pushed to deliver what they say they can.

One other thing is that cloud has become a semi-major player, at least in long-term storage and data sharing. The thing is, as per usual, you have to use those cloud resources very carefully, because if you are sharing ten petabytes of genomics data in Amazon, that’s going to cost you a whole lot of money. Same thing with Google; Google storage is very expensive as well. Again, the interesting thing is, if the data you’re storing in the cloud is something you need to access a lot outside of the cloud environment – and this is the challenge of using a cloud environment – you are going to find yourself paying tens of thousands of dollars each time you move that data out of the cloud. That’s the opportunity cost that clouds charge: their business model is to lock you into using their stuff, otherwise you pay a lot for it.
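To make the egress point concrete, here is a minimal sketch of the arithmetic. The per-GB rate is a hypothetical placeholder for illustration, not any provider’s actual price sheet.

```python
# Back-of-the-envelope cloud egress cost. The $0.09/GB rate is an
# illustrative assumption, not a quote from any cloud provider.
def egress_cost_usd(terabytes: float, rate_per_gb: float = 0.09) -> float:
    """Cost in USD to move `terabytes` of data out of a cloud provider."""
    return terabytes * 1000 * rate_per_gb

# Pulling a 500 TB genomics data set out once lands squarely in the
# "tens of thousands of dollars per transfer" territory described above.
print(f"${egress_cost_usd(500):,.0f}")
```

Run it for a few sizes and the lock-in incentive becomes obvious: the cost scales linearly with every repeat transfer, while keeping the analysis next to the data costs nothing extra in egress.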

Aaron Gardner: One of the big things we’ve seen over the past couple of years is people moving to tiered storage systems and creating storage workflows that facilitate movement through storage tiers and move data out through the right network segments. One of the things driving the movement to tiered storage is that the cost of storage changes depending upon the context of the workload, so the idea of hot storage and cold storage and everything in between becomes important. At a certain scale, you can no longer have a one-size-fits-all storage solution.
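A tiering workflow of the kind described above can be sketched very simply. The tier names, day thresholds, and age-based policy below are illustrative assumptions, not any vendor’s API.

```python
import os

# Minimal sketch of an age-based tiering policy: files untouched longer
# than the thresholds become candidates for demotion to colder, cheaper
# tiers. Tier names and thresholds here are hypothetical.
def tier_for(path: str, now: float, hot_days: int = 30, warm_days: int = 180) -> str:
    age_days = (now - os.path.getmtime(path)) / 86400
    if age_days < hot_days:
        return "hot"    # e.g. NVMe/flash scratch
    if age_days < warm_days:
        return "warm"   # e.g. capacity disk
    return "cold"       # e.g. object store or tape archive
```

A data-movement engine would periodically walk the namespace, evaluate a policy like this, and migrate files whose placement no longer matches their tier; real systems add heat maps and access counters on top of simple age.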

I think where people are now is they are realizing that all of these different types of storage they have bought – with all of the orchestration, middleware layers, and software for moving data around – have mitigated cost over the last few years but added so much complexity that when new things are introduced, such as doing analytics across all of the data you have to extract value, maybe driving training data sets, or building a data commons, all of a sudden you are putting all of your storage into action in a way it wasn’t before.

You’ll notice a lot of storage vendors are talking about their read access patterns and read requirements with deep learning and how that is different from what they did with HPC workloads in the past. Workload access patterns are changing. More and more of the written data is being accessed later, but we are also seeing now that we are getting more [into] analytics and that people are sobering up a lot about, “we have stored everything until now [but] we actually haven’t accessed a lot of it; is it still valuable?” People are trying to do those exercises more and more. We are certainly seeing an increase in data commons efforts within organizations to organize and make accessible all of the actionable data an organization has.

HPCwire: Broadly, is there a rush to new storage architectures to accommodate things like greater analytics demands?

Aaron Gardner: Not holistically. I would say bifurcation is a trend, in the sense that proven strategies are still working alongside new approaches. We’ve been involved in some very large traditional HPC buildouts, architectures, and specification work recently, and we still found that going with tried-and-true platforms with a best-practices design in distributed parallel storage architecture provides a reliable means of storage. So, in some ways it’s not like everything we have been doing for the last five years or decade is getting thrown out. On the other hand, while traditional storage hasn’t changed much, there continue to be improvements. There’s the evolution of Lustre. There’s all of the work done for GPFS under the CORAL program. Those are two stable file systems that are still prevalent in HPC, and so there is very much sustaining innovation happening.

We also have things like cloud and distributed infrastructure, all pushing the software-defined storage paradigm that’s been growing from marketing hype to reality. There’s absolutely a sea of storage offerings in that space. Some of it is chasing performance to harness the capabilities of NAND storage and other new storage mediums like 3D XPoint/Optane type stuff; other parts of it are trying to chase the realities of what has been realized at hyperscale. In terms of economies of scale, a lot of the practices that have been present with web-scale and hyperscale customers are now becoming common in software-defined storage offerings, and they definitely are being consumed by researchers in the life sciences. So, we’re seeing a one-two punch. Tried-and-true methods are the most reliable, but we also are absolutely seeing a shift into next-generation file systems and storage.

HPCwire: What’s happening with Lustre? Last year, Ari was not especially high on it.

Aaron Gardner: Lustre continues to improve. If you look at the leading edge of the releases, it is becoming more tenable for life sciences workloads. The challenge this year is the rapidity at which those features are adopted and supported in the vendor space. I still see it being a couple of years before you can rely on the latest changes and adaptations being present in whatever offering you buy. You really have to be aware of where a particular Lustre vendor stands with respect to all the work that is being done. We are still seeing more adoption of GPFS as far as life sciences workloads. It continues to improve and get better as well, but has some similar challenges in the vendor space. Also, if you drill down into some of the emerging benchmarks with BeeGFS, it does really well with metadata which is becoming more and more important. We are seeing people picking up on BeeGFS in Europe but still waiting to see critical mass stateside.

HPCwire: Object storage has enjoyed a fair amount of buzz this year. What are you seeing in terms of use cases and adoption trends?

Ari Berman: The interesting thing about object storage is that it remains the tactic hyperscalers use to manage their sprawling data, with web services built on top, but it still isn’t penetrating that far – at least in the life sciences space – into everyday utilization with our customer base. Everybody is talking about it. People have proofs of concept. Some folks even have extensive deployments. But what is still missing is the adoption to use it in [its] native data format. Folks are still using emulators and translators in front of object storage. My prediction for the year ahead, because of increased cloud adoption and some algorithm generation, is that folks are going to start seeing native object support as a part of algorithms [in applications]. But the kicker is that most object storage has been sold as an archive solution and doesn’t have the speed built into it that real analytics would need. [The result] is everyone sees object storage as a cheap-and-deep type of solution rather than a scaling analytics solution, and that’s going to be a hard thing to overcome in the market because you can’t do it at speed.

Aaron Gardner: My sense is this past year is when object storage just became commonplace. It’s not new and shiny anymore. It’s just an accepted part of the landscape. I would say there are a couple of different sides to object storage. One is that more and more of the foundational bedrock of storage right above the media level is addressed at the object level; even Lustre has had object as part of its inherent design. So, there’s that back-end architectural consideration. We’re seeing more next-generation efforts from the storage vendors continue to assume an object storage back end, which is really helpful in the cloud.

At the front end, it is purely a protocol consideration. One of the things we’re seeing more of – and this is to Ari’s point about the adoption of object storage at the application level – is that while object is one access method that works really well for web-scale use, folks are now starting to make object more of a protocol consideration: your storage speaks both POSIX and object. It’s just understood that modern storage is going to have a good object front end as well. Sometimes a POSIX front end is needed, and sometimes it’s an object namespace being represented.
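As a sketch of what dual POSIX/object access looks like in practice, the mapping below shows one common convention: an object’s bucket and key surfacing as a directory path through a gateway mount. The mount point, bucket, and key names are all hypothetical.

```python
# Sketch of a dual-protocol view: the same data reachable as an S3-style
# object URI or as a POSIX path through a hypothetical gateway mount.
def object_uri(bucket: str, key: str) -> str:
    """Address the data as an object (bucket + key)."""
    return f"s3://{bucket}/{key}"

def posix_path(bucket: str, key: str, mount: str = "/mnt/objfs") -> str:
    """Address the same data as a file under a gateway mount point."""
    return f"{mount}/{bucket}/{key}"

sample = ("genomics", "run42/sample.bam")
print(object_uri(*sample))   # s3://genomics/run42/sample.bam
print(posix_path(*sample))   # /mnt/objfs/genomics/run42/sample.bam
```

The point of the sketch is that nothing about the data changes between the two views; only the access protocol differs, which is why a storage system can plausibly offer both front ends over one namespace.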

Relating to Ari’s comment on analytics performance, the new and shiny thing is NVMe over Fabrics. Vendors are pushing it through proprietary implementations or taking more open, standards-based approaches. That’s absolutely something that is taking hold, and we will see more adoption moving forward as a way to address the need for low-latency IOPS over the network.

HPCwire: High speed networking has always been important in HPC but perhaps less so in life sciences. That seems to be changing. Could you give us a brief overview of the state of networking in life sciences HPC?

Ari Berman: The most obvious statement I can make with networking is that the byproduct of producing a lot of data is that you have to move it. Everyone talks about delivering the compute to the data and that’s great but you still have to move the data there and you still have to back it up, and manage it, and lifecycle it, and share it, and a lot of these are things that folks aren’t doing. All of those things require high performance or at least performance-minded networking built underneath the systems.

Many organizations have 10Gb or 40Gb capability, and in some cases 100Gb capability, but the security that’s put in front of it throttles everything down to 20Gb batches. High-performance security is something that’s lagging way behind. The whole Science DMZ network architecture introduced a more distributed security model around data-intensive science that enabled performant networking and essentially solved that problem, but typical security shops just haven’t adopted it, don’t understand it, or aren’t comfortable without a physical firewall and many levels of abstraction.

What’s really interesting, and I am going to quote Chris Dagdigian (BioTeam co-founder and Senior Director of Infrastructure), is that we have organizations that are still smug about having 10Gb, which was released 16 years ago. It’s an old technology at this point. It has just reached the point of wide adoption and cost effectiveness such that you actually cannot have an enterprise network without some 10Gb paths in it, especially in scientific environments. As such, 100Gb is quickly becoming the standard at the scientific enterprise scale.

Mellanox ConnectX-6 200Gb/s Ethernet

The 100Gb stuff scares people, but it is at the point where it is easily manageable and easily deployable. It doesn’t necessarily cost a ridiculous amount. You do have to replace some of your 10Gb and 40Gb optics to make it work well. But mostly it’s pretty easy to deploy. Then there’s the 400Gb standard, which was featured at SC17 as a part of SCinet. This is something that’s really happening, and the 400Gb Ethernet standard has now been released. So, that moves the ball forward again towards a terabit, which is probably coming in the next year or so. This is an innovation curve that is needed in the industry.

A fast HPC system connected at 1 gigabit is not going to work. Build a fast HPC system connected at 100Gb and it might work. The challenge remains that moving data at speed is a hard thing to do. Having a single server move data at 100Gb is almost an impossibility, even though 100Gb cards exist for those machines – you have to balance the IO with the PCI Express bus perfectly to even approach those speeds on a single machine. We’ve done a couple of tests where we have reached 72Gb, but it was with sample data. It’s really, really hard to do. Just to drive the point home, PCI Express is the bottleneck right now for single-server transfer speeds. Very clearly, a lot of people have been screaming for PCIe 4.0. It would be great for a big player like Intel to adopt it, for a lot of reasons. If that support were to come sooner, that would allow us to shift the conversation again.
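The single-server bottleneck described above can be checked with quick arithmetic. The per-lane rate and 128b/130b encoding come from the PCIe 3.0 spec; the roughly 20 percent protocol/DMA overhead figure is an assumption for illustration.

```python
# Why one 100GbE NIC can saturate a PCIe 3.0 x16 slot.
# 8 GT/s per lane with 128b/130b encoding gives ~7.88 Gb/s per lane raw;
# the 20% protocol/DMA overhead figure is a rough assumption.
def pcie3_usable_gbps(lanes: int = 16, overhead: float = 0.20) -> float:
    raw_per_lane = 8.0 * (128 / 130)          # Gb/s after line encoding
    return lanes * raw_per_lane * (1 - overhead)

print(round(pcie3_usable_gbps(), 1))  # roughly 100 Gb/s usable for an x16 slot
```

Under those assumptions, an x16 slot delivers barely the line rate of a single 100Gb NIC, leaving no headroom for the storage or memory traffic on the same bus; PCIe 4.0 doubles the per-lane rate, which is why it changes the conversation.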

HPCwire: What’s your take on the jockeying between Ethernet and InfiniBand for sway in HPC?

Ari Berman: We have seen this interesting phase in the market where 100Gb Ethernet is priced similarly to EDR InfiniBand. What that means is everyone says, “well, we can get 100Gb either way, so let’s use something that’s easy to use, and we don’t have to have RDMA – or we can have a system use RDMA over Converged Ethernet or something like that if we use these specialized switches.” The truth with HPC, in my personal opinion, is that Ethernet is still a wonky protocol to use on the back end of a truly scaling HPC system. It still causes issues for TCP, even within a contained system, that reduce some of the performance gains you get from a low-latency, lossless networking protocol like Omni-Path or InfiniBand. The interesting thing within life sciences is that low latency isn’t tremendously needed for parallelism in the analytics, but it is really needed for delivering and absorbing data to and from the storage systems at scale. If you have 100,000 cores or 10,000 nodes trying to access a single file system, you have to have an incredible network, not to mention an incredible storage system, to be able to deliver that data to the cluster as a whole. Those are the challenges we are seeing.

Here’s another challenge. We have designed, and we have seen, designs where the back end of a cluster is entirely InfiniBand or Omni-Path, because you can build those to support TCP/IP and the native protocols work well for analytics. The problem then is how you scale that out into the Ethernet space. So, you have to have gateways, and InfiniBand routers, and all these rather convoluted things that sometimes work well and sometimes don’t, but also have to operate at speed. There are a lot of kludges in the market there.

We are seeing Arista 100Gb Ethernet installed in the middle of clusters now because they have hit a price point that is similar to Mellanox and Intel, and those work fine; it’s something folks know that doesn’t require extra knowledge to operate. We’ve seen some of the NSF supercomputing centers put in Omni-Path. They really like it. It works well. But we haven’t seen wide adoption across life sciences. At least in life sciences, Mellanox is still the king here, and the interesting thing about Mellanox is they have responded to the competition from Omni-Path by quadrupling their capabilities.

Aaron Gardner: Many people are always trying to guess when InfiniBand is going to die out and Ethernet will take hold. We still see InfiniBand demand being strong going forward while we note that Mellanox continues to sell more and more Ethernet. That pushes a healthy diversified ecosystem and you have players like Arista in the mix as well. On OPA, we’ve seen it used to good effect when people want to adopt Intel’s platform strategy.

I think the thing that would perhaps slow Intel – and why we haven’t seen OPA translate to the general market the way it has with the leadership class – is what we are seeing with CPU diversification: maybe you want some Arm, maybe you want some AMD in part of your environment too. People then crave uniformity at the networking layer, which is why I think InfiniBand and Ethernet will stay in HPC, hold their lines, and even increase market share. I don’t see a single emerging solution for the network fabric space.

HPCwire: Thank you both for your time and the overview. Let’s check the scorecard next year.


By Tiffany Trader

Supercomputer Modeling Tests How COVID-19 Spreads in Grocery Stores

April 8, 2020

In the COVID-19 era, many people are treating simple activities like getting gas or groceries with caution as they try to heed social distancing mandates and protect their own health. Still, significant uncertainty surrounds the relative risk of different activities, and conflicting information is prevalent. A team of Finnish researchers set out to address some of these uncertainties by... Read more…

By Oliver Peckham

Microsoft Azure Adds A100 GPU Instances for ‘Supercomputer-Class AI’ in the Cloud

August 19, 2020

Microsoft Azure continues to infuse its cloud platform with HPC- and AI-directed technologies. Today the cloud services purveyor announced a new virtual machine Read more…

By Tiffany Trader

Japan’s Fugaku Tops Global Supercomputing Rankings

June 22, 2020

A new Top500 champ was unveiled today. Supercomputer Fugaku, the pride of Japan and the namesake of Mount Fuji, vaulted to the top of the 55th edition of the To Read more…

By Tiffany Trader

Joliot-Curie Supercomputer Used to Build First Full, High-Fidelity Aircraft Engine Simulation

July 14, 2020

When industrial designers plan the design of a new element of a vehicle’s propulsion or exterior, they typically use fluid dynamics to optimize airflow and in Read more…

By Oliver Peckham

Intel Speeds NAMD by 1.8x: Saves Xeon Processor Users Millions of Compute Hours

August 12, 2020

Potentially saving datacenters millions of CPU node hours, Intel and the University of Illinois at Urbana–Champaign (UIUC) have collaborated to develop AVX-512 optimizations for the NAMD scalable molecular dynamics code. These optimizations will be incorporated into release 2.15 with patches available for earlier versions. Read more…

By Rob Farber

  • arrow
  • Click Here for More Headlines
  • arrow
Do NOT follow this link or you will be banned from the site!
Share This