Despite intense AI arms race, we’re in for a multi-model future
Every week, sometimes every day, a new state-of-the-art AI model is born into the world. As we move into 2025, the pace at which new models are being released is dizzying, if not exhausting. The curve of the rollercoaster is continuing to grow exponentially, and fatigue and wonder have become constant companions. Each release highlights why this particular model is better than all others, with endless collections of benchmarks and bar charts filling our feeds as we scramble to keep up.
Eighteen months ago, the vast majority of developers and businesses were using a single AI model. Today, the opposite is true. It is rare to find a business of significant scale that is confining itself to the capabilities of a single model. Companies are wary of vendor lock-in, particularly for a technology that has quickly become a core part of both long-term corporate strategy and short-term bottom-line revenue. It is increasingly risky for teams to place all their bets on a single large language model (LLM).
But despite this fragmentation, many model providers still champion the view that AI will be a winner-takes-all market. They claim that the expertise and compute required to train best-in-class models are scarce, defensible and self-reinforcing. From their perspective, the hype bubble for building AI models will eventually collapse, leaving behind a single, giant artificial general intelligence (AGI) model that will be used for anything and everything. To exclusively own such a model would mean being the most powerful company in the world. The size of this prize has kicked off an arms race for more and more GPUs, with a new zero added to model parameter counts every few months.
We believe this view is mistaken. There will be no single model that will rule the universe, neither next year nor next decade. Instead, the future of AI will be multi-model.
Language models are fuzzy commodities
The Oxford Dictionary of Economics defines a commodity as a “standardized good which is bought and sold at scale and whose units are interchangeable.” Language models are commodities in two important senses:
The models themselves are becoming more interchangeable on a wider set of tasks;
The research expertise required to produce these models is becoming more distributed and accessible, with frontier labs barely outpacing each other and independent researchers in the open-source community nipping at their heels.
But while language models are commoditizing, they are doing so unevenly. There is a large core of capabilities that any model, from GPT-4 all the way down to Mistral Small, is perfectly suited to handle. At the same time, as we move towards the margins and edge cases, we see greater and greater differentiation, with some model providers explicitly specializing in code generation, reasoning, retrieval-augmented generation (RAG) or math. This leads to endless handwringing, Reddit-searching, evaluation and fine-tuning to find the right model for each job.
And so while language models are commodities, they are more accurately described as fuzzy commodities. For many use cases, AI models will be nearly interchangeable, with metrics like price and latency determining which model to use. But at the edge of capabilities, the opposite will happen: Models will continue to specialize, becoming more and more differentiated. As an example, DeepSeek-V2.5 is stronger than GPT-4o on coding in C#, despite being a fraction of the size and 50 times cheaper.
Both of these dynamics — commoditization and specialization — uproot the thesis that a single model will be best-suited to handle every possible use case. Rather, they point towards a progressively fragmented landscape for AI.
Multi-model orchestration and routing
There is an apt analogy for the market dynamics of language models: The human brain. The structure of our brains has remained unchanged for 100,000 years, and brains are far more similar than they are dissimilar. For the vast majority of our time on Earth, most people learned the same things and had similar capabilities.
But then something changed. We developed the ability to communicate in language — first in speech, then in writing. Communication protocols facilitate networks, and as humans began to network with each other, we also began to specialize to greater and greater degrees. We became freed from the burden of needing to be generalists across all domains, to be self-sufficient islands. Paradoxically, the collective riches of specialization have also meant that the average human today is a far stronger generalist than any of our ancestors.
On a sufficiently wide input space, the universe always tends towards specialization. This is true all the way from molecular chemistry to biology to human society. Given sufficient variety, distributed systems will always be more computationally efficient than monoliths. We believe the same will be true of AI. The more we can leverage the strengths of multiple models instead of relying on just one, the more those models can specialize, expanding the frontier for capabilities.
An increasingly important pattern for leveraging the strengths of diverse models is routing — dynamically sending queries to the best-suited model, while also leveraging cheaper, faster models when doing so doesn’t degrade quality. Routing allows us to take advantage of all the benefits of specialization — higher accuracy with lower costs and latency — without giving up any of the robustness of generalization.
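As a rough illustration of the routing idea described above, the sketch below picks the cheapest model whose expected quality on a task is within a small tolerance of the best available. Everything in it is an assumption made for illustration: the model names, prices and quality scores are invented, and the task label is taken as given, whereas a production router would have to infer it from the query and would draw its scores from real evaluations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str                  # hypothetical model identifier
    cost_per_1k_tokens: float  # made-up price, in arbitrary units
    quality: dict              # task label -> estimated score in [0, 1]


def route(task: str, models: list, tolerance: float = 0.02) -> Model:
    """Return the cheapest model whose score on `task` is within
    `tolerance` of the best score any model achieves on that task."""
    best = max(m.quality.get(task, 0.0) for m in models)
    good_enough = [m for m in models if m.quality.get(task, 0.0) >= best - tolerance]
    return min(good_enough, key=lambda m: m.cost_per_1k_tokens)


# Invented catalog: one large generalist, one small generalist, one specialist.
MODELS = [
    Model("large-generalist", 10.0, {"chat": 0.90, "code": 0.80}),
    Model("small-generalist", 0.5, {"chat": 0.89, "code": 0.60}),
    Model("code-specialist", 1.0, {"chat": 0.70, "code": 0.85}),
]

print(route("chat", MODELS).name)  # the cheap generalist is good enough here
print(route("code", MODELS).name)  # the specialist wins on code
```

With these invented numbers, everyday chat goes to the small, cheap model because it is nearly as good as the large one, while coding queries go to the specialist because nothing else comes close. That is the commodity core and the specialized edge in miniature.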
A simple demonstration of the power of routing can be seen in the fact that many of the world’s top models are themselves routers: They are built using Mixture-of-Experts architectures that route each next-token generation to a small subset of their expert sub-models. If it’s true that LLMs are exponentially proliferating fuzzy commodities, then routing must become an essential part of every AI stack.
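To make the Mixture-of-Experts comparison concrete, here is a toy sketch of top-k gating, one common way such architectures choose experts. It is a simplification under stated assumptions: the router scores are passed in by hand rather than produced by a learned layer, the "experts" are plain functions on a single number, and the function names are ours, not any library's API. Real models apply this per token inside each expert layer.

```python
import math


def top_k_gate(router_logits: list, k: int = 2) -> dict:
    """Select the k highest-scoring experts and softmax-normalize their
    scores, returning {expert_index: weight}. All other experts are skipped."""
    ranked = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for numerical stability
    exps = {i: math.exp(router_logits[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


def moe_forward(x: float, experts: list, router_logits: list, k: int = 2) -> float:
    """Combine the outputs of only the selected experts, weighted by their gates."""
    gates = top_k_gate(router_logits, k)
    return sum(weight * experts[i](x) for i, weight in gates.items())


# Four toy experts; only two of them run for any given input.
experts = [lambda x: x, lambda x: 2 * x, lambda x: 3 * x, lambda x: 4 * x]
logits = [0.1, 2.0, -1.0, 2.0]  # hand-written stand-in for a learned router's output

print(top_k_gate(logits))
print(moe_forward(1.0, experts, logits))
```

The point of the sketch is the shape of the computation: most experts never run for a given input, so total capacity can grow without a matching growth in per-query cost.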
There is a view that LLMs will plateau as they reach human intelligence: that as we fully saturate capabilities, we will coalesce around a single general model in the same way that we have coalesced around AWS, or the iPhone. Neither of those platforms (nor their competitors) has 10X’d its capabilities in the past couple of years, so we might as well get comfortable in their ecosystems. We believe, however, that AI will not stop at human-level intelligence; it will carry on far past any limits we might even imagine. As it does so, it will become increasingly fragmented and specialized, just as any other natural system would.
We cannot overstate how much AI model fragmentation is a very good thing. Fragmented markets are efficient markets: They give power to buyers, maximize innovation and minimize costs. And to the extent that we can leverage networks of smaller, more specialized models rather than send everything through the internals of a single giant model, we move towards a much safer, more interpretable and more steerable future for AI.
The greatest inventions have no owners. Ben Franklin’s heirs do not own electricity. Turing’s estate does not own all computers. AI is undoubtedly one of humanity’s greatest inventions; we believe its future will be — and should be — multi-model.
Zack Kass is the former head of go-to-market at OpenAI.
Tomás Hernando Kofman is the co-founder and CEO of Not Diamond.