If you’ve looked at your tech stock screen lately, you may have realized that we’re out of the first phase of the AI hype cycle. A recent correction of 10% (or more) in the biggest tech stocks, including Amazon, Google and Microsoft, has everyone questioning expectations of an AI boom that went too far, too soon.
But it is still early. Building AI will be a long journey, with many twists and turns. The biggest cloud providers poured billions into new infrastructure to support the creation of large language models (LLMs) to power new applications and services, and the market heated up. The next phase of AI will be characterized by a more practical focus on return on investment (ROI), cost and privacy.
What I expect to unfold next is the evolution of private enterprise AI: enterprises using hybrid cloud AI deployments to increase productivity and improve services, cost-effectively. After all, while Microsoft and Nvidia may rule the world of the stock market and the S&P 500 Index, they are merely suppliers to the rest of the world, which must turn their technology into profits.
ROI: The big question
We are not here to say that LLMs and Big AI are a waste – they are part of the evolution of the market. In recent talks with practitioners, I’m finding a lot of interest in experimenting with a variety of AI models combined with private infrastructure. There is also growing interest in small language models (SLMs) as a cost-effective way to build targeted productivity in enterprises with private data.
The debate over the cost of AI was catalyzed in June of this year, when Sequoia Capital partner David Cahn asked, in an article titled “AI’s $600 Billion Question,” where the revenue will come from to cover the estimated $600 billion cost of AI.
Cahn made several salient points. While he remains bullish on AI over the long term, using the railway analogy (if you build it, the trains will come), he points to the huge maintenance costs of this infrastructure. And we still don’t know what services and business models will pay for it. Returns will come, but capitalism can be a slow process.
So why should we be looking at private AI? In my conversations with CIOs and technology leaders, it’s clear that they have reservations about mass adoption of Big AI, due to questions about cost, security and economics. But there are ways to make AI work for the enterprise: Reduce the cost of infrastructure, feed it with your private data, and create proof points for productivity and revenue gains.
Private AI can be deployed at scale on flexible hybrid cloud infrastructure, either on-premises or in the cloud, enabling enterprises to experiment with AI without betting the farm.
In an interesting example, network infrastructure provider Hedgehog explained at a recent tech pitch day how computer vision company Luminar deployed AI on its private infrastructure at one-sixth the cost of using an AI service in the cloud. It built this infrastructure to power its computer vision models with just 20 Nvidia L40 graphics processing units (GPUs).
Luminar, like most companies, was looking for a productivity gain that didn’t require huge risks. Many enterprises are confirming such wins, but they are doing so by experimenting with AI in a targeted way, using their own data and infrastructure.
How private RAGs can pay
There are tactical proof points that show better returns for enterprises with the use of private AI. These use cases can come in many forms: custom applications built with partners on private infrastructure and SLMs, or commercial AI services fine-tuned with private data using retrieval-augmented generation (RAG).
Audi has produced a great example of this, explaining how it refined a customer service chatbot using its own private data and RAG to reduce hallucinations and provide more relevant information. Audi partnered with software provider Storm Reply to build the chatbot. Training it with Audi data gave the chatbot the advantages of better security and more accuracy.
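The RAG pattern described above is straightforward in outline: retrieve the most relevant private documents for a query, then ground the model’s answer in them. Below is a minimal sketch, with assumptions labeled: the toy bag-of-words retriever stands in for a real vector database, `PRIVATE_DOCS` is invented sample data, and in practice the final prompt would be sent to whatever model the enterprise runs (an on-premises SLM or a cloud API) – that call is omitted here. This is not Audi’s or Storm Reply’s implementation, just an illustration of the technique.

```python
# Minimal retrieval-augmented generation (RAG) sketch.
# Assumption: a bag-of-words cosine-similarity retriever stands in for a
# production vector database; PRIVATE_DOCS is hypothetical sample data.
import math
import re
from collections import Counter

PRIVATE_DOCS = [
    "The 2024 warranty covers battery repairs for eight years.",
    "Service centers are open Monday through Saturday.",
    "Software updates are delivered over the air every quarter.",
]

def _vec(text: str) -> Counter:
    """Tokenize into a lowercase bag-of-words vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the query."""
    q = _vec(query)
    ranked = sorted(docs, key=lambda d: _cosine(q, _vec(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Ground the model in retrieved private data to curb hallucinations."""
    context = "\n".join(retrieve(query, docs))
    return (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )

# The resulting prompt would be passed to the enterprise's chosen model.
print(build_prompt("How long does the warranty cover the battery?", PRIVATE_DOCS))
```

The design point is that the model never sees the whole corpus – only the retrieved context – which is what keeps private data private and answers grounded.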
This doesn’t mean that hyperscaler services like OpenAI’s ChatGPT or Microsoft’s GitHub Copilot will be avoided – but they often need to be improved by combining them with private data, using RAG and other techniques. And many companies have their own price points in mind for implementing these services.
In another example, I recently spoke about private artificial intelligence with Faizan Mustafa, a former Toyota CIO who is now VP of AI and Enterprise IT at cloud networking provider Aviatrix. Mustafa told me how Aviatrix is using AI internally to drive productivity, based on its own private data.
Some of the AI applications Aviatrix is targeting include customer relationship management and customer support. Mustafa says the company is using its own data from customer interactions to identify pain points or ways to improve its products.
Mustafa also sees the benefit in helping developers. Aviatrix is using Microsoft’s GitHub Copilot to help developers generate documentation faster. “If they don’t have to write long documentation, it helps increase the capacity of software developers. We have seen the real value,” he said.
Examples like the one provided by Aviatrix are happening around the world in private experiments. Many of these involve the collection and processing of private enterprise data with RAG applications.
AI costs must be reduced
What will drive the private AI revolution? In short: cost reduction. The democratization of technology comes through economics. Many enterprises look at hyperscaler deployments costing tens of billions of dollars and conclude they simply can’t afford that. They may also find mass-market LLMs too general; they need more targeted AI. As in the Luminar example, they can buy a few dozen AI chips and experiment with their data on a much smaller scale. They don’t need all the data in the world.
The costs of building private AI clouds can be significantly lower than using public cloud models. Consultancy Future Processing estimates that the costs of building an AI model for an enterprise can range from $5,000 to $500,000, depending on complexity. These numbers are a tiny fraction of what hyperscalers spend on AI infrastructure. For example, Meta is estimated to have spent $740 million to build its Llama 3 AI model. It’s also spending tens of billions on new AI chips and infrastructure.
For these reasons, we expect interest and experimentation in private AI to explode. A bunch of startups besides Storm Reply are targeting this area.
Mark McQuade, a former Hugging Face engineer, founded a company called Arcee AI, which targets SLMs for enterprises. McQuade, who is CEO of Arcee, told me that SLMs will help lower the cost of AI and democratize the technology for enterprises.
“We heard customers say they want to own their data,” McQuade told me. “If you can get software and gen AI running in the private cloud, that’s the holy grail.”
McQuade says Arcee has successfully built AI models running on a single Nvidia A100 GPU. The company has a flexible subscription business model that allows customers to use the software either on premises or as a service in the cloud. The company has 25 employees and has raised $24 million in venture capital.
So we need to stop worrying about whether AI will be “up or down” and think more about where it will go next. It is a process of applying technology to discover ROI at the most economical rate. To get to the next level, AI will need to be four things: secure, private, accurate and cost-effective. All enterprises are working on this, and returns have a high ceiling.