Small Language Models and Vectors Will Bring AI into Reach of Mid-Range Businesses, Says Aerospike Chief Scientist
Naren Narendran says the shift from LLMs to SLMs and vectors is introducing focused AI models that demand less compute power and are more accessible to users
The cost-prohibitive nature of large language models (LLMs) is prompting a progression towards small language models (SLMs), resulting in solutions that decentralise and democratise their use, says Naren Narendran, Chief Scientist at Aerospike Inc., a real-time database leader. This shift is propelling AI adoption across a broader range of industries, making it accessible to mid-size organisations and enabling data-driven, hyper-personalised experiences.
While LLMs are exceptional at handling general-purpose queries, they require huge amounts of compute and storage, putting them out of reach for most companies. However, the market is shifting more quickly than expected towards SLMs: domain-specific models with fewer parameters that require less processing power.
“LLMs are too large and unfocused for most business applications and are expensive to run, so companies are switching their attention to more economical and specific SLMs,” Narendran says. “In addition, vectors, which came to prominence because of LLMs, are beginning to be used for more classical applications, including predictions, fraud detection and personalisation. It’s no longer necessary to manually collect a set of features from vast amounts of data and feed them into a custom model. Instead, a vector can be used to encode features past and present and provide hyper-personalised recommendations. This is expanding the horizons for semantic search in a way we could not have predicted just a few years ago.”
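The vector-based personalisation Narendran describes can be illustrated with a minimal sketch: user activity is encoded as a feature vector in the same space as item embeddings, and recommendations fall out of a similarity ranking. The feature values and item names below are hypothetical, standing in for embeddings an actual model would produce.

```python
import numpy as np

def cosine_similarity(a, b):
    """Similarity between two vectors, ignoring magnitude."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical item embeddings (in practice, produced by an embedding model
# and stored in a vector database).
items = {
    "running shoes": np.array([0.9, 0.1, 0.0]),
    "trail jacket":  np.array([0.7, 0.3, 0.1]),
    "office chair":  np.array([0.0, 0.2, 0.9]),
}

# A single user vector encoding features past and present.
user = np.array([0.8, 0.2, 0.05])

# Rank items by similarity to the user vector: no hand-built feature
# pipeline or custom model required.
ranked = sorted(items, key=lambda name: cosine_similarity(user, items[name]),
                reverse=True)
```

At scale, the brute-force ranking above is replaced by an approximate nearest-neighbour index in a vector database, but the principle is the same.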
“Vectors have become part of the conversation in 2024 in the same way that LLMs did in 2023, and this is because LLMs can’t do the AI and ML job alone. There are other critical segments, including SLMs and vector databases, that bring together the full complement of tools. This evolution is allowing a new wave of companies, particularly in real-time industries, to adopt technology for split-second, automated decisions. Now they can more fully leverage AI for the business.”
Narendran thinks that this year will see more companies honing their AI strategies and adapting so they can use data in a more personalised way.
“As organisations bump up against the constraints of LLMs, they will try other options. If running ChatGPT is too expensive, they’ll turn to an SLM. Rather than depending on platforms to serve content based on behaviour collected from a user over the last year, they will turn to vectors that can dynamically encode and search on activity from an hour ago,” he said. “The landscape is changing quickly, and the search is on for AI tools that can deliver business value. We anticipate an innovative year ahead.”
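The move from year-old behaviour profiles to activity from an hour ago can be sketched as a running update: blend the long-term user vector with an embedding of the latest session before searching. The blending weight and the toy two-dimensional vectors here are illustrative assumptions, not a prescribed method.

```python
import numpy as np

def update_profile(profile, recent, alpha=0.3):
    """Blend a long-term user profile with a vector encoding the most
    recent activity (a simple exponential moving average), then
    renormalise so similarity search stays well-behaved."""
    blended = (1 - alpha) * profile + alpha * recent
    return blended / np.linalg.norm(blended)

long_term = np.array([1.0, 0.0])   # interests accumulated over months
last_hour = np.array([0.0, 1.0])   # embedding of activity an hour ago

profile = update_profile(long_term, last_hour)
# The updated vector now reflects recent behaviour and can be searched
# against an item index immediately, without retraining any model.
```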