Neoclouds Bet Inference Will Outpace Training Revenue
CoreWeave and Nebius are securing multi-billion-dollar deals with Meta and Anthropic as neoclouds bet the inference market's growth will outstrip training, but the economic model remains unproven.
Over the span of 48 hours in early April, CoreWeave signed two deals that recast what a neocloud can become. On April 9, the company announced a $21 billion agreement with Meta to supply dedicated AI cloud capacity from 2027 through the end of 2032, the largest contract in its history. The following day, it disclosed a multiyear deal with Anthropic to run the Claude model family on its infrastructure. With those two signatures, CoreWeave could claim it now serves nine of the 10 leading AI model providers, Forbes reported.
Two weeks later, Nebius, the Amsterdam-headquartered competitor that re-emerged from the ashes of Yandex's European assets, reported first-quarter revenue of $399 million, a 684 percent increase from the same period a year earlier. The stock jumped 18 percent on the day, Barron's noted, narrowing the valuation gap with CoreWeave and turning the neocloud sector from a single-name story into something that looks more like a category. Together, CoreWeave and Nebius have now secured commitments from Meta alone totalling roughly $48 billion, an extraordinary concentration of demand in a class of cloud provider that barely existed three years ago.
Neoclouds emerged as a structural response to the GPU shortage that gripped the industry from late 2022 through much of 2024. The hyperscalers (AWS, Microsoft Azure, and Google Cloud) could not provision Nvidia H100 clusters fast enough to meet demand from AI labs, so a new tier of specialist providers stepped in. CoreWeave, originally a cryptocurrency mining operation, pivoted to GPU rentals and secured early and favourable access to Nvidia hardware. Lambda Labs, founded as a deep learning workstation company, did the same. Crusoe repurposed flare gas from oil fields to power modular data centres. Each of them built a business around one insight: that if you could get your hands on Nvidia's latest chips before the hyperscalers could, the market would pay a premium.
That premium has now been priced into commitments large enough to reshape balance sheets. CoreWeave ended 2025 with a contracted backlog of $66.8 billion, a figure that has only grown since the Meta and Anthropic deals. The company reaffirmed its full-year 2026 revenue guidance of $12 billion to $13 billion during its most recent earnings call, with CFO Nitin Agrawal telling analysts the firm was raising its 2026 exit run-rate floor to $18 billion. CoreWeave has become the fastest cloud provider in history to surpass $5 billion in annual revenue, driven by 168 percent year-on-year growth, a pace that would be remarkable in any sector and is nearly unheard of in enterprise infrastructure.
The deals with Meta and Anthropic reveal something subtler than raw demand, however. Meta's original agreement with CoreWeave, signed in 2025, was valued at roughly $14 billion. The new $21 billion commitment brings the total to approximately $35 billion and extends through 2032. Meta is building its own infrastructure at extraordinary scale, but it is also locking in external capacity years in advance of needing it. That behaviour is the tell. A company with Meta's engineering resources and capital access does not sign a seven-year capacity contract unless it believes, at an institutional level, that the compute requirements of frontier AI will outstrip even its own build-out.
The Inference Pivot
The neocloud revenue story to date has been a training story. Large language models consume staggering amounts of compute during pre-training and fine-tuning, and customers paid by the GPU-hour for access to clusters that could run for weeks or months. But the industry's centre of gravity is shifting. Inference, the process of running a trained model to generate output, is becoming the larger and more economically significant workload. Every query to ChatGPT or Claude consumes compute that must be provisioned somewhere, and as AI agents proliferate, each agent action can trigger multiple inference calls. The inference market is less spiky than training, more continuously utilised, and in theory offers higher margins for providers who can keep their clusters running at high utilisation.
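The utilisation arithmetic behind that claim can be made concrete. The sketch below uses entirely hypothetical prices, costs, and utilisation rates (none drawn from any provider's disclosures) to show why a GPU renter's margin swings so sharply with billed hours: the cluster's costs accrue around the clock, but revenue only accrues when the GPUs are rented.

```python
# Illustrative sketch only: all prices, costs, and utilisation figures
# below are assumptions for exposition, not figures from any provider.

def gpu_hour_margin(price_per_hour, cost_per_hour, utilisation):
    """Operating margin on a rented GPU, given the share of hours billed.

    cost_per_hour accrues continuously (power, depreciation, debt service),
    but price_per_hour is earned only on utilised hours.
    """
    revenue = price_per_hour * utilisation
    if revenue == 0:
        return float("-inf")
    return (revenue - cost_per_hour) / revenue

# A training cluster billed on long committed runs vs. an inference cluster
# whose utilisation rises and falls with end-user traffic.
scenarios = [
    ("training, committed", 0.95),
    ("inference, busy", 0.85),
    ("inference, slack", 0.45),
]
for label, util in scenarios:
    m = gpu_hour_margin(price_per_hour=2.50, cost_per_hour=1.40, utilisation=util)
    print(f"{label}: margin {m:.0%}")
```

Under these made-up numbers, the committed training cluster runs at a healthy positive margin while the slack inference cluster loses money on every hour, which is why the inference opportunity depends on keeping utilisation high rather than on the size of the market alone.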
This is not a prediction confined to neocloud marketing decks. Nvidia CEO Jensen Huang has publicly described inference as the larger long-term opportunity. The hyperscalers are building their own inference-optimised infrastructure, including Google's TPU v5p pods and Amazon's Trainium2 and Inferentia chips. And a new class of inference-specialist cloud platforms is emerging. In early April, DigitalOcean acquired Katanemo Labs and unveiled what it calls its "Agentic Inference Cloud," a platform purpose-built for production AI workloads rather than research and experimentation. The company claims AI-native startups on its platform are seeing 50 percent faster training cycles and a 40 percent latency reduction compared to general-purpose cloud infrastructure.
The inference opportunity changes the competitive dynamics between neoclouds and hyperscalers in ways that are not yet fully understood. Training workloads are portable: a lab can move a training run from one provider to another as long as the GPU architecture is consistent. Inference workloads are stickier. They require low-latency connectivity to end users, integration with application logic, and service-level agreements that meet production standards. If a neocloud can become the infrastructure layer under a widely deployed AI application, the switching costs rise. That is the strategic logic behind CoreWeave's Anthropic deal: it is not just capacity, it is a relationship with one of the two frontier labs whose models power a large and growing share of enterprise AI workloads.
Fragile Economics, Concentrated Risk
For all the revenue momentum, the neocloud model carries structural risks that are only beginning to surface in analyst reports. CNBC reported in late April that Wall Street was growing bullish on the sector, but that McKinsey had warned the economics are fragile. Neoclouds emerged as "stopgaps to address the GPU shortage," and the question McKinsey raised is what happens when the shortage eases. If Nvidia's manufacturing capacity catches up to demand, the scarcity premium that neoclouds have been charging could compress. If the hyperscalers' own silicon (Google's TPUs, Amazon's Trainium, Microsoft's rumoured Athena chip) reaches parity with Nvidia GPUs for inference workloads, the neoclouds' value proposition narrows.
Customer concentration is the risk that compounds all the others. CoreWeave disclosed in its S-1 filing that a small number of customers account for a large majority of its revenue. The Meta relationship alone represents a significant share of the company's future contracted revenue. If Meta's own infrastructure build-out accelerates, or if the social media giant renegotiates terms, CoreWeave's revenue trajectory could shift materially. Nebius faces a similar dynamic. After Nvidia invested $2 billion in the company, the chipmaker became both a critical supplier and a significant shareholder, a dual relationship that gives Nvidia unusual influence over Nebius's cost structure and strategic direction.
The neoclouds are also making a bet that may prove expensive if they are wrong: they are, for the most part, refusing to diversify their silicon. The Information reported in its AI Infrastructure newsletter on May 6 that Nebius, Lambda, and CoreWeave have all declined to offer Google's TPUs in their cloud platforms, even as Google has been lobbying aggressively to expand the reach of its custom chips. The newsletter, authored by reporter Anissa Gardizy, noted that Google's push to get TPUs into third-party clouds is part of a broader effort to challenge Nvidia's dominance. For the neoclouds, saying no to TPUs is a bet that Nvidia's CUDA ecosystem and the customer preference it creates will remain more valuable than any discount Google might offer. It is also a bet that Nvidia will continue to allocate scarce next-generation chips to them.
That allocation question is not academic. Nvidia's Vera Rubin architecture, the successor to Blackwell, is expected to begin shipping in volume in 2027. CoreWeave's $21 billion Meta deal explicitly references Vera Rubin as part of the capacity commitment, according to The Next Web. The implication is that CoreWeave has secured promises from Nvidia for Rubin-class GPUs years before they are available to the general market. If those promises hold, the neocloud moat widens. If they do not, if Nvidia decides to prioritise direct sales to the hyperscalers or to its own DGX Cloud service, the neocloud business model faces a supply-side shock that no amount of contracted backlog can protect against.
"We are reaffirming our full year guidance of $12 billion to $13 billion." Nitin Agrawal, CFO, CoreWeave, Q4 2025 earnings call
The debt load required to sustain this build-out is another variable the market has yet to price fully. Both CoreWeave and Nebius have issued substantial debt to finance their data centre expansion, using their GPU fleets and contracted revenue as collateral. In a low-interest-rate environment, the math works. In a higher-rate environment, or in a scenario where GPU values depreciate faster than expected because of a new architecture cycle, the debt servicing costs could consume margins that the equity story assumes will flow to the bottom line. Morningstar noted in its analysis of the CoreWeave-Meta deal that the company's "continuous high growth" thesis rests on execution that has not yet been tested across a full hardware cycle.
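A back-of-envelope version of that rate-and-depreciation squeeze makes the point. All of the figures below (fleet value, debt, gross profit, rates, useful life) are hypothetical assumptions chosen for illustration, not estimates of any company's actual position:

```python
# Hypothetical sketch: how interest rates and a faster hardware cycle
# squeeze a debt-financed GPU fleet. Every number is an assumption.

def cash_after_fixed_costs(gross_profit, debt, rate, fleet_value, dep_years):
    """Gross profit minus annual interest and straight-line depreciation."""
    interest = debt * rate
    depreciation = fleet_value / dep_years
    return gross_profit - interest - depreciation

fleet_value = 10_000_000_000   # $10B GPU fleet, largely debt-financed
debt = 8_000_000_000
gross_profit = 2_600_000_000   # assumed annual gross profit on the fleet

# Benign case: cheap debt, six-year useful life.
benign = cash_after_fixed_costs(gross_profit, debt, 0.05, fleet_value, 6)
# Stress case: higher rates, and a new architecture cycle that shortens
# the fleet's economic life to four years.
stress = cash_after_fixed_costs(gross_profit, debt, 0.09, fleet_value, 4)

print(f"benign: ${benign / 1e9:+.2f}B per year")
print(f"stress: ${stress / 1e9:+.2f}B per year")
```

With these assumed inputs, the same fleet swings from positive to negative annual cash after fixed costs purely on the rate and depreciation assumptions, which is the fragility the equity story has to price.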
Energy is the constraint no one in the neocloud sector likes to discuss publicly, but it is surfacing in permitting documents and grid interconnection queues nonetheless. A WIRED investigation published in late April found that new gas-powered data centre projects linked to AI workloads could emit more than 129 million tons of greenhouse gases annually. Neoclouds, which tend to build in regions with available power rather than regions with clean power, are particularly exposed to regulatory risk as jurisdictions begin to impose emissions-related restrictions on new data centre construction. Crusoe, which built its brand on reducing flared gas, has a differentiated story on this front. CoreWeave and Nebius do not.
Then there is the question of what happens when the training-to-inference shift actually arrives at scale. Training contracts are large, lumpy, and paid upfront or on a committed basis. Inference revenue, by contrast, tends to be consumption-based and variable. A neocloud that signs a $21 billion training-heavy contract with Meta knows its revenue for the next several years. A neocloud that shifts toward inference workloads is exposed to the usage patterns of end-user applications, which can be seasonal, competitive, and subject to rapid changes in model architecture that alter the compute-per-query ratio. The inference cloud market may be larger, but it may also be less predictable.
The competitive landscape is expanding faster than the neoclouds' ability to differentiate from one another. Andromeda AI, a GPU rental startup, raised funding at a $1.5 billion valuation in March 2026, SiliconANGLE reported. DigitalOcean's inference-cloud acquisition puts it in direct competition with the neoclouds for the mid-market AI customer. The hyperscalers are not standing still: AWS is expanding its Trainium-based UltraCluster offerings, Microsoft is building out Azure's AI infrastructure with a mix of Nvidia and its own silicon, and Google Cloud is cutting TPU prices to attract inference workloads away from GPU-based competitors. The neoclouds are growing fast into a market that is growing even faster, but their share of that market is not guaranteed.
What separates the neoclouds that endure from those that do not will likely be the quality of their customer relationships rather than the quantity of their GPU inventory. CoreWeave's decision to lock in Meta through 2032 and to become the preferred infrastructure provider for Anthropic's Claude models suggests a strategy built on becoming indispensable to the biggest buyers of compute. Nebius, with its European footprint and its Nvidia-backed balance sheet, is positioning as the non-US alternative for labs and enterprises that want geographic diversification. Lambda, which has been quieter on the deal-making front, appears to be betting on developer experience and ease of use as its differentiator, a playbook that worked for Stripe and Twilio in earlier platform cycles.
The next two quarters will be revealing. CoreWeave's Q1 results, due in early May, will show whether the company's backlog is converting to revenue at the pace the market expects. Nebius will face its own test: can it sustain 600-plus percent growth as the base effect catches up? And the hyperscalers will report their cloud revenue for the April-to-June period, with AI-related growth rates that either validate or undermine the neocloud thesis that specialist providers can take share from the incumbents. The clearest signal to watch may be Nvidia's own allocation announcements. If Vera Rubin's initial shipments are disclosed as going disproportionately to the neoclouds, the scarcity premium holds. If the hyperscalers dominate the allocation, the neoclouds' competitive advantage narrows overnight.
The neocloud bet is, at bottom, a bet that the AI infrastructure market is large enough to sustain a second tier of providers beneath the hyperscalers, and that the inference era will favour specialists who can optimise for a narrower set of workloads. That bet has paid off handsomely for early investors. Whether it pays off for the public-market investors who are buying CoreWeave and Nebius shares at premium multiples will depend on factors the companies do not fully control: Nvidia's chip allocation strategy, hyperscaler silicon roadmaps, the pace of the training-to-inference transition, and the regulatory treatment of energy-intensive data centre construction. Each of those variables will produce data points in the coming quarters. The story is no longer whether neoclouds exist. It is whether they can build businesses durable enough to survive the cycle that made them.