With private company defaults climbing to 9.2%, a rate not seen in years, financial volatility is spilling into the artificial intelligence industry. Venture capital firm Lux Capital recently warned companies that depend on AI to get compute capacity commitments in writing: with instability now rippling through the AI supply chain, informal "handshake agreements" are no longer enough to guarantee critical computational resources, leaving AI ventures exposed to unexpected disruptions.
An alternative strategy is gaining traction as a hedge against those risks: smaller AI models that run directly on a user’s own device. Going local removes the dependence on remote data centers, cloud providers, and the counterparty risk that comes with any third-party service. As compact models become "good enough" for practical applications, they offer a credible path to resilience and autonomy, and one of the companies betting on that shift is Multiverse Computing, a Spanish startup.
Multiverse Computing has kept a lower profile than some of its higher-visibility peers, but rising demand for AI efficiency and operational independence is changing its trajectory. The company has compressed models from leading AI labs, including OpenAI, Meta, DeepSeek, and Mistral AI, and that work has now produced two launches: an app that showcases its compressed models and an API portal that gives developers direct access to those models so they can build on them at a broader scale.
The CompactifAI app, which takes its name from Multiverse’s quantum-inspired compression technology, is an AI chat tool along the lines of ChatGPT or Mistral’s Le Chat: users ask questions and get answers from the embedded model. The difference is what sits underneath. The app ships with Gilda, a model the company says is compact enough to run entirely locally and offline, so data stays on the device and processing works without an internet connection.
For end users, the CompactifAI app offers a glimpse of "AI on the edge": personal data stays on the device, and the app keeps working without network connectivity. The caveat is practical. The phone needs enough RAM and storage to run Gilda locally, and many older iPhones don’t meet that bar. On those devices the app automatically switches to cloud-based models via an API, a routing system Multiverse has named Ash Nazg, a reference fans of J.R.R. Tolkien’s "The Lord of the Rings" will recognize. When requests go to the cloud, though, the app gives up its main advantage: local processing and the privacy that comes with it.
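Multiverse has not published how Ash Nazg decides when to stay on-device, but the general fallback pattern is straightforward: check whether the phone has the memory and storage the local model needs, run locally if it does, and otherwise send the request to a hosted endpoint. The sketch below illustrates that logic in Python with hypothetical thresholds and placeholder functions; it is not Multiverse’s implementation.

```python
import shutil

import psutil  # third-party package, used here only to read available memory

# Hypothetical resource requirements for running a compressed model on-device.
LOCAL_MODEL_MIN_RAM_GB = 4.0
LOCAL_MODEL_MIN_DISK_GB = 2.5


def device_can_run_locally() -> bool:
    """Return True if the device appears to have enough free RAM and storage."""
    free_ram_gb = psutil.virtual_memory().available / 1e9
    free_disk_gb = shutil.disk_usage("/").free / 1e9
    return free_ram_gb >= LOCAL_MODEL_MIN_RAM_GB and free_disk_gb >= LOCAL_MODEL_MIN_DISK_GB


def run_local_model(prompt: str) -> str:
    # Placeholder for on-device inference with the bundled compressed model.
    return f"[local reply to: {prompt}]"


def call_cloud_model(prompt: str) -> str:
    # Placeholder for an HTTPS request to a hosted model endpoint.
    return f"[cloud reply to: {prompt}]"


def answer(prompt: str) -> str:
    """Route a prompt to the on-device model when possible, else fall back to the cloud."""
    if device_can_run_locally():
        return run_local_model(prompt)  # data never leaves the device
    return call_cloud_model(prompt)  # fallback: the privacy benefit of local processing is lost
```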
Those hardware constraints suggest the CompactifAI app, in its current form, is not built for mass consumer adoption. Sensor Tower data shows fewer than 5,000 downloads in the past month, which fits a strategy that is not primarily about consumer uptake. Multiverse Computing’s real target is businesses and enterprise developers, and for them the company is today launching a self-serve API portal that gives developers and enterprises direct access to its compressed models without going through intermediaries such as the AWS Marketplace.

Enrique Lizaso, CEO of Multiverse Computing, framed the launch in exactly those terms: "The CompactifAI API portal now gives developers direct access to compressed models with the transparency and control needed to run them in production." That transparency and control matter to enterprise clients, who need to integrate the models into existing workflows, monitor performance, and manage compute resources.
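Multiverse has not published the portal’s request format here, so the sketch below simply assumes an OpenAI-style chat-completions interface; the URL, model name, and field names are placeholders, not documented API details.

```python
import os

import requests

# Placeholder endpoint and model name; substitute the values from the actual portal documentation.
API_URL = "https://api.example-compactifai-portal.com/v1/chat/completions"
API_KEY = os.environ["COMPACTIFAI_API_KEY"]


def ask_compressed_model(prompt: str, model: str = "example-compressed-model") -> dict:
    """Send one chat request to a compressed model behind the portal and return the JSON reply."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()


if __name__ == "__main__":
    print(ask_compressed_model("Summarize this week's incident reports in three bullet points."))
```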
A standout feature of the new API portal is real-time usage monitoring, and that is no accident. Beyond the edge-deployment benefits of lower latency and better privacy, the prospect of much lower compute costs is a primary reason enterprises are exploring smaller models as an alternative to large language models (LLMs), and real-time monitoring lets businesses track consumption closely and keep AI spending in check, a crucial factor in the current economic climate.
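Monitoring matters because spend on hosted models is essentially a function of tokens consumed. The toy bookkeeping below shows the kind of running tally a team might keep on its own side; the per-token prices are made up, and the usage fields assume the API reports token counts the way OpenAI-compatible endpoints typically do.

```python
from dataclasses import dataclass

# Made-up prices in dollars per 1,000 tokens; real rates come from the provider's pricing page.
PRICE_PER_1K_INPUT = 0.0004
PRICE_PER_1K_OUTPUT = 0.0016


@dataclass
class UsageTracker:
    """Accumulates token counts and estimated spend across API calls."""

    input_tokens: int = 0
    output_tokens: int = 0

    def record(self, response: dict) -> None:
        # Assumes the response carries a `usage` object with prompt/completion token counts,
        # as OpenAI-compatible APIs usually do; adjust the field names to the real schema.
        usage = response.get("usage", {})
        self.input_tokens += usage.get("prompt_tokens", 0)
        self.output_tokens += usage.get("completion_tokens", 0)

    @property
    def estimated_cost(self) -> float:
        return (self.input_tokens / 1000) * PRICE_PER_1K_INPUT + (
            self.output_tokens / 1000
        ) * PRICE_PER_1K_OUTPUT


tracker = UsageTracker()
tracker.record({"usage": {"prompt_tokens": 120, "completion_tokens": 450}})
print(f"{tracker.input_tokens} tokens in / {tracker.output_tokens} out, "
      f"~${tracker.estimated_cost:.4f} so far")
```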
Smaller models are also far less limited than they once were, and the performance gap with larger models keeps narrowing. This week, for instance, Mistral AI updated its small model family with Mistral Small 4, which the company says is optimized for general chat, coding, agentic tasks, and reasoning. The French company also launched Forge, a system that lets enterprises build customized models, including tailored small models, so businesses can choose the performance trade-offs that fit their use cases.
Multiverse Computing’s own recent work points to the same narrowing gap between compressed and large language models. Its latest compressed model, HyperNova 60B 2602, is built on gpt-oss-120b, OpenAI’s open-weight model, and the company claims it delivers faster responses at a significantly lower cost than the model it was derived from. That combination matters most for agentic coding workflows, in which AI systems autonomously work through multi-step programming tasks.
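Speed and cost claims of this kind are straightforward to sanity-check when both the compressed model and its base are reachable behind comparable chat endpoints. The rough harness below shows one way to compare median request latency; the URLs and model names are hypothetical, and a serious benchmark would also control for output length, warm-up, and concurrency.

```python
import statistics
import time

import requests

# Hypothetical endpoints and model names, for illustration only.
ENDPOINTS = {
    "compressed": ("https://api.example.com/v1/chat/completions", "example-compressed-60b"),
    "original": ("https://api.example.com/v1/chat/completions", "example-base-120b"),
}


def time_one_request(url: str, model: str, prompt: str, api_key: str) -> float:
    """Return wall-clock seconds for one non-streaming chat request."""
    start = time.perf_counter()
    resp = requests.post(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=120,
    )
    resp.raise_for_status()
    return time.perf_counter() - start


def compare(prompts: list[str], api_key: str) -> None:
    """Print median latency per endpoint over a shared prompt set."""
    for label, (url, model) in ENDPOINTS.items():
        latencies = [time_one_request(url, model, p, api_key) for p in prompts]
        print(f"{label}: median {statistics.median(latencies):.2f}s over {len(prompts)} prompts")
```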
Building models small enough to run on mobile devices while staying genuinely useful is a hard technical problem. Apple Intelligence, for example, handles it with a hybrid architecture that pairs an on-device model with a cloud-based counterpart. Multiverse’s CompactifAI app can likewise route requests to gpt-oss-120b via API when local processing isn’t feasible, but its purpose goes beyond cost savings: the app exists to show that local models such as Gilda, and its successors, offer advantages that aren’t only economic.
For professionals in critical and sensitive fields, a model that runs locally without a constant cloud connection offers a level of privacy and operational resilience that cloud services can’t match: sensitive data stays in controlled environments, and the AI keeps working in disconnected or compromised settings. The bigger prize, though, is the new business use cases the technology unlocks, such as putting AI directly into drones, satellites, and other remote or specialized settings where network connectivity can’t be guaranteed, with potential applications from defense and aerospace to remote sensing and infrastructure management.
Multiverse Computing already serves more than 100 customers worldwide, including the Bank of Canada, Bosch, and Iberdrola, and a growing customer base should strengthen its appeal to investors. After raising a $215 million Series B last year, the company is now rumored to be raising a fresh €500 million round, potentially at a valuation above €1.5 billion, a sign of how strongly the market believes in its approach to AI deployment.