It’s a truism of data analytics: when it comes to data, more is generally better. But the explosion of AI-powered large language models (LLMs) like ChatGPT and Google Gemini (formerly Bard) challenges this conventional wisdom.
As organizations in every industry rush to enrich their own private data sets with LLMs, the quest for more and better data is unfolding at a scale never seen before, stretching the limits of present-day infrastructure in new and disruptive ways. Yet the sheer scale of the data sets ingested by LLMs raises an important question: Is more data really better if you don’t have the infrastructure to handle it?
Training LLMs on internal data poses many challenges for data and development teams. It requires considerable compute budgets, access to powerful GPUs (graphics processing units), complex distributed compute techniques, and teams with deep machine learning (ML) expertise.
Outside of a few hyperscalers and tech giants, most organizations today simply don’t have that infrastructure readily available. That means they are forced to build it themselves, at great cost and effort. If the required GPUs are available at all, cobbling them together with other tools to create a data stack is prohibitively expensive. And it’s not how data scientists want to spend their time.
Three Pitfalls to Avoid
In the quest to pull together or bolster its infrastructure to meet these new demands, what’s an organization to do? When setting out to train and tune LLMs against its data, what guideposts can it look for to make sure its efforts are on track and not jeopardizing the success of its projects? The best way to identify potential risks is to watch for the following three pitfalls:
1. Focusing too much on building the stack vs. analyzing the data
Time spent assembling a data stack is time taken away from the stack’s reason for being: analyzing your data. If your team is spending too much of its time on assembly, look for a platform that automates the foundational elements of building your stack so your data scientists can focus on analyzing and extracting value from the data. You want to be able to pick the components, then have the stack generated for you so you can get to insights quickly.
2. Finding GPUs needed to process the data
Remember when all the talk was about managing cloud costs through multi-cloud solutions, cloud portability, and so on? Today, there’s an analogous conversation about GPU availability and right-sizing. Which GPU is right for your LLM, who provides it, what does it cost per hour to analyze your data, and where do you want to run your stack? Making the right decisions requires balancing multiple factors, such as your computational needs, budget constraints, and future requirements. Look for a platform that is architected to give you the choice and flexibility to use the GPUs that fit your project and to run your stack wherever you choose, be it on different cloud providers or on your own hardware.
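To make that balancing act concrete, here is a minimal Python sketch of one way to compare GPU options. Everything in it is an assumption for illustration: the GPU names, providers, memory sizes, hourly prices, and relative speeds are hypothetical placeholders, not real offerings, and a real decision would also weigh availability and future requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuOption:
    name: str              # hypothetical label, not a real product
    provider: str          # hypothetical provider or "on-prem"
    memory_gb: int         # memory available on the GPU
    hourly_usd: float      # assumed hourly price
    relative_speed: float  # throughput relative to a baseline GPU (1.0)


def cheapest_fit(options, min_memory_gb, baseline_hours, budget_usd):
    """Return (option, total_cost) for the lowest total job cost among
    options with enough memory that stay within budget, or None."""
    best = None
    for opt in options:
        if opt.memory_gb < min_memory_gb:
            continue  # the model would not fit on this GPU
        hours = baseline_hours / opt.relative_speed
        cost = hours * opt.hourly_usd
        if cost > budget_usd:
            continue  # over budget
        if best is None or cost < best[1]:
            best = (opt, cost)
    return best


# Illustrative numbers only.
OPTIONS = [
    GpuOption("small-gpu", "cloud-a", 24, 1.00, 1.0),
    GpuOption("mid-gpu", "cloud-b", 48, 2.50, 2.0),
    GpuOption("large-gpu", "on-prem", 80, 4.00, 4.0),
]

choice = cheapest_fit(OPTIONS, min_memory_gb=40, baseline_hours=100, budget_usd=500)
if choice is not None:
    option, total = choice
    print(f"{option.name} on {option.provider}: ${total:.2f}")
```

With these made-up figures, the GPU with the highest hourly rate turns out to be the cheapest for the job because it finishes sooner, which is why hourly price alone is a poor guide to right-sizing.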
3. Running AI workloads against your data cost-effectively
Finally, given the high costs involved, no one wants to pay for idle resources. Look for a platform that offers ephemeral environments, which allow you to spin up and spin down your instances so you only pay when you’re using the system, not when it’s idle and waiting.
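The arithmetic behind paying only for what you use can be sketched in a few lines of Python. The hourly rate, usage hours, session count, and spin-up overhead below are hypothetical assumptions chosen to keep the numbers simple. They do not describe any particular platform's pricing.

```python
HOURS_IN_MONTH = 730  # roughly 24 * 365 / 12


def always_on_cost(hourly_usd):
    """Cost of an instance that runs all month, busy or idle."""
    return hourly_usd * HOURS_IN_MONTH


def ephemeral_cost(hourly_usd, active_hours, sessions, startup_hours):
    """Cost when instances are spun up per session and torn down after.

    Each session is assumed to bill a fixed spin-up overhead on top of
    the hours of actual work.
    """
    billed_hours = active_hours + sessions * startup_hours
    return hourly_usd * billed_hours


# Illustrative assumptions: a $4.00/hour GPU instance, 120 hours of real
# work per month spread over 40 sessions, 15 minutes of spin-up each.
hourly = 4.00
always_on = always_on_cost(hourly)
ephemeral = ephemeral_cost(hourly, active_hours=120, sessions=40, startup_hours=0.25)
savings = always_on - ephemeral

print(f"always-on: ${always_on:.2f}")
print(f"ephemeral: ${ephemeral:.2f}")
print(f"savings:   ${savings:.2f}")
```

The gap narrows as utilization rises or as spin-up overhead grows, so the comparison is worth rerunning with your own usage pattern.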
Déjà Vu All Over Again?
In many ways, data scientists seeking to extract insights from their data using LLMs face a dilemma similar to the one software developers faced in the early days of DevOps. Developers who just wanted to build great software had to take on running operations and their own infrastructure. That “shift left” eventually led to bottlenecks and other inefficiencies for dev teams, which ultimately kept many organizations from reaping the benefits of DevOps.
This issue was partly solved by DevOps teams (and now, increasingly, platform engineering teams) tasked with building platforms that developers could code on top of. The idea was to recast developers as customers of the DevOps or platform engineering (PE) teams, and in doing so free them up to write great code without having to worry about infrastructure.
The lesson for organizations caught up in the rush to gain new insights from their data by incorporating the latest LLMs is this: Don’t saddle your data scientists with infrastructure worries.
Let Data Scientists Be Data Scientists
In the brave new world opened up by LLMs and the next-gen GPUs that can handle data-intensive AI workloads, let your data scientists be data scientists. Let them use these astounding innovations to test hypotheses and gain insights that help you train and optimize your data models. That work drives value that can differentiate your organization in the market and lead to the creation of new products.
To navigate this golden age of opportunity effectively, choose a platform that helps you focus on your differentiators while automating the foundational elements of building your AI stack. Look for a solution that gives you choice and flexibility in GPU usage and where you run your stack. Lastly, find an option that offers ephemeral environments that allow you to optimize costs by paying only for the resources you use. Embracing these key principles will empower you to solve the infrastructure dilemma posed by today’s Gen AI gold rush—and position your organization for success.
About the author: Erik Landerholm is a seasoned software engineering leader with over 20 years of experience in the tech industry. As the co-founder of Release.com and a Y Combinator alum from the summer of 2009, Erik has a rich history of entrepreneurial success. His previous roles include co-founder of CarWoo! and IMSafer, as well as Senior Vice President and Chief Architect at TrueCar.