How to Spot a Legacy Storage Vendor

January 18, 2021

Like everyone else in the storage industry, I love reading what Chris Mellor writes. Last week he described a “street fight” between NetApp and Pure Storage, with each trying to paint the other as a legacy storage vendor.

I even retweeted his story with our own quick summary.

I then decided to spend some time expanding on the message of that tweet with more detail, specifically, how to spot storage vendors selling legacy architectures. I feel strongly about this subject, as I started a company to get away from the older limitations and tradeoffs that were minted in the ’90s. First, however, I am compelled to acknowledge that every new storage vendor stands on the shoulders of the industry giants that developed the prior innovations, and for this I have tremendous respect for the legacies built by both NetApp and Pure Storage.

Now, let’s turn to my Twitter post, examine the five points in that tweet, and elaborate on what makes a vendor a legacy vendor. (Expanding on this subject requires far more than 280 characters!) Keep in mind that vendors don’t have to hit all five points to be considered legacy vendors in my eyes, but obviously the more points they “tick,” the less relevant they are to solving the problems of tomorrow.

Point #1: Selling Systems Built with a Proprietary “Tin”

Surprisingly, the term “tin” came out in other blogs to which Mellor referred and was not brought up by me.

Storage systems started as highly engineered hardware-based solutions in the ’80s and ’90s. To advance the state of the art in resiliency, density, and performance, storage system designers back then were forced to engineer special buses, memory, and resilient enclosures. Then, in the early 2000s, there was a strong movement toward solutions built on commercial off-the-shelf (COTS) hardware and driven by software. I was a successful part of that movement at XIV (acquired by IBM), where we were the first to build a Tier-1 block/SAN solution with no custom-engineered hardware. Although we were not designing any hardware, all of the solutions back then still relied on specialized components from obscure or niche “shelves”; even these COTS solutions were not the standard servers you could simply buy from server vendors such as Hewlett Packard Enterprise or Supermicro. We relied on specialty vendors such as Xyratex.

Fast forward to today. With the advancement in flash and NVMe technology, networking architectures, and server platforms, it is possible to build best-of-breed storage solutions using totally standard off-the-shelf components that are widely available from pretty much any server vendor.

Summary #1: If the storage solution you are using is available only in a customized, proprietary hardware form factor from your storage vendor, and you cannot run it on servers from the vendor you like doing business with, it’s a clear indication that you’re not using a solution based on current design principles and that you’re buying legacy storage.

This brings me to the second point from my tweet: the cloud.

Point #2: Cloud Offering Is Different & Compromised Compared to Proprietary “Tin” → Hybrid Cloud is Limited

Legacy vendors whose solutions are based on their own proprietary “tins,” with their hardware dependencies, cannot run the same software on public cloud infrastructure. They have to develop a different solution. Sure, they may give it the same name, sneaking “Cloud” into the title, but it is a totally different solution from the one that runs on the custom “tin.” In fact, they may even run it as a managed service on the proprietary tin inside a public cloud data center, but that is not what the cloud is about.

If they want to run on cloud infrastructure, they must either port a subset of their solution to run on the cloud, changing some design elements along the way, or create completely new products for the cloud. Either way, the functionality of legacy vendors’ products on the cloud is not the same as what customers enjoy on premises (on the private cloud). The integrations have to be different, which means that the scalability, resiliency, and performance of the solution on the public cloud are different.

Many of the organizations we work with are transitioning toward a hybrid cloud model, running the bulk of their workloads on their on-premises private clouds while retaining the ability to burst to the public cloud for elasticity or for DR (disaster recovery). This lets organizations combine the compelling economics of on-premises storage with the compelling economics of cloud DR (you don’t pay unless you run a drill or face an actual disaster) and of bursting with spot instances.

At Weka we were able to show that we got an enterprise-grade shared-file storage solution on AWS into the top three of the IO500 list, where the first two entries are not even commercial offerings in GA.

Summary #2: If your current storage vendor does not support exactly the same specification for features, CLI, and performance, and if it doesn’t scale the same on-premises and in the cloud, these are clear indications that you’re using a legacy vendor.

Let’s keep going.

Point #3: Limited Scale and Support for Mixed Workloads → Tons of Silos

Modern design principles allow the creation of highly distributed storage systems that can grow to hundreds of petabytes and even exabytes. Also, it is possible to come up with IO stacks that allow running diverse applications on the same system.

Legacy solutions were created with design principles, limitations, and tradeoffs that prevented them from reaching large scale, in terms of either capacity or performance. Moreover, those designs forced tuning parameters (this was also referenced in the blogs!) that limit each system’s performance envelope to certain types of IOs. When data sets were relatively small and storage systems were single-purpose, this approach worked, but today’s workloads explore data at petabyte scale, and IO patterns are unknown in advance.

You shouldn’t have to consider what your system limitations might be if you want to tune for small IOs and low latency (high IOPS number) versus large IOs with high throughput or many metadata operations. You should be able to run them all over the same data on a single scalable system built on modern hardware designs. Period.
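To make that concrete, here is a minimal sketch (in Python) of what running both patterns over the same data looks like from a single client: small random reads for the latency/IOPS case and large sequential writes for the throughput case, against one shared mount. The mount path and file names are illustrative assumptions, the script expects an existing data file at that path, and it is not tied to any particular product.

import os
import time

MOUNT = "/mnt/shared-fs"          # illustrative mount point (assumption)
SMALL_IO = 4 * 1024               # 4 KiB "small IO"
LARGE_IO = 1024 * 1024            # 1 MiB "large IO"

def small_random_reads(path, count=1000):
    """Average latency of small random reads (the IOPS-style workload)."""
    size = os.path.getsize(path)
    total = 0.0
    with open(path, "rb") as f:
        for _ in range(count):
            offset = int.from_bytes(os.urandom(4), "big") % max(size - SMALL_IO, 1)
            start = time.perf_counter()
            f.seek(offset)
            f.read(SMALL_IO)
            total += time.perf_counter() - start
    return total / count

def large_sequential_write(path, chunks=256):
    """Throughput of large sequential writes (the bandwidth-style workload)."""
    data = os.urandom(LARGE_IO)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(chunks):
            f.write(data)
        f.flush()
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return (chunks * LARGE_IO) / elapsed / 1e6    # MB/s

if __name__ == "__main__":
    latency = small_random_reads(os.path.join(MOUNT, "dataset.bin"))   # pre-existing file (assumption)
    bandwidth = large_sequential_write(os.path.join(MOUNT, "stream.tmp"))
    print(f"avg small-read latency: {latency * 1e6:.1f} us, sequential write: {bandwidth:.0f} MB/s")

On a system built the way this point describes, both numbers should hold up when the two workloads run side by side over the same data, without per-volume tuning knobs.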

Another important requirement, for both on-premises and the cloud, is having the ability to expand a system on demand while it is online. Otherwise, you’re stuck with a horrible sizing effort each time you start a new project and buy a system. Good up-to-date storage systems must have the ability to expand, to add more capacity or performance while in heavy production usage.

Summary #3: Go back and think about your datacenter architecture decisions. If you must deploy more storage systems than functionally required because of the physical separation of resources, you’re using a legacy storage vendor. If you must make sizing decisions that lock you in for the life of that system, you’re also using a legacy storage vendor.

There’s more….

Point #4: Limited Aggregate Performance & Single Client Performance → Unfit for GPU

About a decade ago it became obvious that Moore’s law had run out of steam. The enterprise world had to solve larger, more complex problems, which led to extreme scale-out, and the limitations of legacy storage vendors also led to the proliferation of “Big Data” solutions that tried to circumvent those limitations.

In the last few years GPUs have entered the data centers, and they provide significantly more compute capacity than was possible using even very large CPU compute farms.

A single, modern GPU-filled server today can reach about 5 petaFLOPS. That performance is equivalent to the world’s top supercomputer of about a decade ago. Those room-filling supercomputers needed IO systems that could deliver dozens of GB/s of aggregate throughput just to keep them fed. Now that the same compute capacity fits in a single box, it still needs that kind of IO throughput, and when multiple GPU-filled servers are used, the aggregate throughput needs to be even greater.
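As a rough back-of-the-envelope sketch of that arithmetic (all numbers here are illustrative assumptions drawn from the paragraph above, not measurements), keeping the decade-old ratio of IO throughput to compute implies something like this:

# Back-of-the-envelope only; the ratios below are illustrative assumptions.
GPU_SERVER_PFLOPS = 5        # a dense GPU-filled server today (per the text)
OLD_SUPER_PFLOPS = 5         # roughly a top supercomputer of a decade ago
OLD_SUPER_IO_GBPS = 50       # "dozens of GB/s" of aggregate IO, assumed midpoint

# IO throughput needed per PFLOPS if the old compute-to-IO balance is preserved
gbps_per_pflops = OLD_SUPER_IO_GBPS / OLD_SUPER_PFLOPS

for servers in (1, 4, 16):
    aggregate = servers * GPU_SERVER_PFLOPS * gbps_per_pflops
    print(f"{servers:>2} GPU server(s): ~{aggregate:.0f} GB/s of storage throughput needed")

Even with these conservative assumptions, a small cluster of GPU servers quickly demands hundreds of GB/s from the storage layer.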

Future-looking companies are migrating their workloads to GPUs. To be effective, each GPU-based server needs to access data at rates that historically would have equaled the aggregate throughput of dozens or hundreds of legacy CPU servers connected to different storage systems. This trend started with AI/ML, but now we see many examples across pharmaceutical companies, financial organizations, and retailers where customers are replacing large-scale, “Big Data,” open-source solutions that fill entire data centers with a single machine or a few racks of infrastructure.

In order to leverage and unleash the power of the GPU platform, storage systems need to “up their game” in terms of aggregate overall performance and, even more importantly, single-client performance; otherwise, they face the risk of wasting the valuable compute cycles of these dense compute systems.

Summary #4: If the storage system you’re using now has the same single-client performance limitations that existed about a decade ago, you’re using a legacy storage vendor. If the system you’re using now has an aggregate throughput/IOPS number that has not increased dramatically when compared with numbers a decade ago, you’re using a legacy storage vendor.

Let’s explore the final point from my tweet.

Point #5: Data Backup and DR Are Performed by Others–or They’re Afterthoughts

Data protection (backup/archive/etc.) is a huge responsibility for any storage professional. Back when storage systems were limited in scale and workload and mobility to the cloud was unheard of, backup was a simple procedure, which led to the proliferation of “secondary storage” backup vendors.

With data capacities growing exponentially, businesses demand a sound DR strategy, a cloud bursting strategy, and backup and archive both on premises and in the cloud. Viewing these as discrete problems and solving each one individually with a separate product is expensive and wasteful, however. When your data grows to petabyte scale, you don’t want to store many copies of it to satisfy separate uses (three for archive, another one for DR, etc.), and you don’t want to integrate several different products to do so.
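To put an illustrative number on how quickly discrete copies become wasteful at that scale, here is a tiny sketch; the data set size, the copy counts, and the $/TB/month figure are assumptions made up for the example, not quotes.

# Illustrative arithmetic only; every figure below is an assumption.
DATASET_PB = 2                     # assumed size of the primary data set
COPIES = {                         # discrete copies kept by separate products
    "primary": 1,
    "backup": 1,
    "archive": 3,                  # "three for archive" from the text
    "DR": 1,
}
COST_PER_TB_MONTH = 20.0           # assumed blended cost in $/TB/month

total_copies = sum(COPIES.values())
total_pb = DATASET_PB * total_copies
monthly_cost = total_pb * 1000 * COST_PER_TB_MONTH
print(f"{total_copies} copies of {DATASET_PB} PB -> {total_pb} PB stored, "
      f"roughly ${monthly_cost:,.0f} per month at ${COST_PER_TB_MONTH:.0f}/TB/month")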

Now, if we consider the “bread and butter” features of storage resiliency and integrity, here are some questions to ask; the answers could indicate that you’re using a legacy vendor from these perspectives:

  • Does your storage system slow to a crawl when it is 75% full?
  • Does storage system performance grind to a halt during rebuilds, or for as long as some hardware component is still faulty?
  • Do you still have to rebuild 100% of the failed storage media even when it held little data?

Summary #5: If your primary storage product, your backup product, and your cloud product are different entities, you are dealing with a legacy storage vendor. If your storage vendor forces you to treat backup and DR differently and store the data twice, you’re using a legacy vendor. If your storage vendor forces you to go to a third-party solution to get a sound backup or archive strategy, you’re also using a legacy vendor. If the performance of the storage system drops significantly during a rebuild, you’re using a legacy vendor. If you are still rebuilding blocks and not files, you are using a legacy vendor. And if you don’t have end-to-end data integrity protection to the client (each block has a checksum that is calculated at the client and verified at each step along the way to ensure no bit-rot), you’re using a legacy storage vendor.
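To illustrate that last item, here is a minimal, toy sketch of the end-to-end integrity idea: the client computes a checksum when it writes a block and verifies it again when it reads the block back, so silent corruption anywhere along the path is caught. This is an in-memory example written for this post, not any vendor’s implementation.

import hashlib

def write_block(store: dict, block_id: str, data: bytes) -> None:
    """Client-side write: keep the data together with a client-computed checksum."""
    digest = hashlib.blake2b(data, digest_size=16).hexdigest()
    store[block_id] = (data, digest)

def read_block(store: dict, block_id: str) -> bytes:
    """Client-side read: recompute the checksum and fail loudly on any mismatch."""
    data, expected = store[block_id]
    actual = hashlib.blake2b(data, digest_size=16).hexdigest()
    if actual != expected:
        raise IOError(f"bit-rot detected in block {block_id}")
    return data

if __name__ == "__main__":
    fake_store = {}                       # stand-in for the real data path
    write_block(fake_store, "blk-0", b"hello storage")
    assert read_block(fake_store, "blk-0") == b"hello storage"

    # Simulate silent corruption somewhere between the client and the media.
    data, digest = fake_store["blk-0"]
    fake_store["blk-0"] = (b"hellX storage", digest)
    try:
        read_block(fake_store, "blk-0")
    except IOError as err:
        print(err)                        # bit-rot detected in block blk-0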

There’s one more point to make.

A Bonus: Point #6

I could go on about what makes a good storage solution, and that is probably a good subject for the next blog. I have many positive things to say about up-to-date storage solutions that go beyond the legacy-vendor characteristics described here.

Speaking of which, I’m going to add another identification cue to this guide even though it was not part of my original tweet. Another strong indication that you’re buying from a legacy storage vendor is this: if that vendor has many offerings and you have to use different products to achieve your goal, then what you’re trying to achieve was not the original design point of the solution being offered to you.

Summary #6: If your storage vendor has many products, each with slightly different tradeoffs, and you have to use a different mix of them as solutions to different projects, you’re using a legacy vendor.

Conclusion

It’s important to reiterate that all new storage vendors get to build on the great technologies of the legacy vendors. I respect all of Weka’s competitors in the marketplace and acknowledge that the majority of the storage solutions sold today are solid and obviously still have a place in today’s data center. But as CIOs start to think about the data centers of the future and the objectives they will have to meet in years to come, they are well served to question whether the solutions that got them to 2020 will meet the needs of the future. Each generation produces a giant leap in performance and scale.

By the way, if you’re still curious as to my opinion about the original matter, both NetApp and Pure are obviously selling legacy storage products by the standards described here. Clearly, both are legacy storage vendors.

