Liquid Cooling Market to Hit $29.5B as AI Racks Pass 100 kW

The global data center liquid cooling market is projected to reach $29.5 billion by 2033, expanding at a compound annual growth rate of 20.1% from 2026 onward, according to a Grand View Research report published June 9. The figure, distributed by PR Newswire and covered by TMCnet, is more than a market-sizing exercise. It marks the moment cooling moved from a facilities footnote to a line item large enough to reshape per-token economics for every model provider running dense GPU clusters.

The driver is straightforward arithmetic. A single NVIDIA H100 GPU configured for large-model inference at batch size 1 draws roughly 700 W. A rack of 72 H100s lands near 50 kW. The Grace Blackwell GB200 NVL72, sampled this year, pushes a single rack past 120 kW. Traditional air cooling tops out around 35 to 40 kW per rack. Beyond that, the cost of moving enough air through the facility rises faster than the thermal load itself. The industry crossed that line sometime in mid-2024. Since then, every new deployment announcement with a plausible AI workload has been liquid-cooled by default.
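The rack arithmetic can be checked in a few lines. This is a minimal sketch that uses only the figures quoted above; the constant and function names are illustrative, and the calculation counts GPU power only, ignoring CPUs, networking, fans, and power-supply losses.

```python
# Figures below are taken from the paragraph above; real deployments vary.
GPU_WATTS = 700                  # approximate draw of one H100 under load
GPUS_PER_RACK = 72
AIR_COOLING_LIMIT_KW = (35, 40)  # quoted practical ceiling for air cooling
GB200_NVL72_RACK_KW = 120        # quoted lower bound for a GB200 NVL72 rack


def rack_kw(gpus: int, watts_per_gpu: float) -> float:
    """GPU-only rack power in kW (assumption: no CPU, NIC, or PSU overhead)."""
    return gpus * watts_per_gpu / 1000


def exceeds_air_limit(load_kw: float,
                      limit_kw: float = AIR_COOLING_LIMIT_KW[1]) -> bool:
    """True if the load is above the most generous quoted air-cooling ceiling."""
    return load_kw > limit_kw


h100_rack = rack_kw(GPUS_PER_RACK, GPU_WATTS)
print(f"72x H100 rack: {h100_rack:.1f} kW, "
      f"over air limit: {exceeds_air_limit(h100_rack)}")
print(f"GB200 NVL72 rack: {GB200_NVL72_RACK_KW} kW, "
      f"over air limit: {exceeds_air_limit(GB200_NVL72_RACK_KW)}")
```

Both configurations sit above even the upper 40 kW ceiling, which is the point of the paragraph: the H100 rack overshoots by about a quarter, the GB200 rack by a factor of three.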

TechRepublic reported on June 18 that many data centers now "struggle to get enough power into the facility to run AI applications," a formulation that captures the dual bottleneck: total site power and per-rack thermal density both constrain cluster size. The cooling subsystem typically eats 30% to 35% of a data center's total energy budget, according to figures cited by New Atlas in May. For a 100-megawatt campus running at 80% utilization, that is roughly $16 million to $18 million in annual electricity costs at the U.S. industrial average of 7.5 cents per kilowatt-hour, before any compute cycles are sold.
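The campus-level figure is sensitive to the assumed load factor and electricity rate, so it is worth seeing how it is built. The following is a rough sketch of the calculation using the inputs named above; the function and parameter names are illustrative and do not come from any cited report.

```python
HOURS_PER_YEAR = 8760


def annual_cooling_cost_usd(site_mw: float,
                            utilization: float,
                            cooling_share: float,
                            usd_per_kwh: float) -> float:
    """Yearly electricity cost attributable to cooling.

    Assumes the site draws site_mw * utilization on average all year and
    that cooling takes a fixed share of that energy.
    """
    energy_kwh = site_mw * 1000 * utilization * HOURS_PER_YEAR
    return energy_kwh * cooling_share * usd_per_kwh


# Inputs from the paragraph: 100 MW, 80% utilization, 7.5 cents/kWh,
# cooling at 30% to 35% of total energy.
low = annual_cooling_cost_usd(100, 0.80, 0.30, 0.075)
high = annual_cooling_cost_usd(100, 0.80, 0.35, 0.075)
print(f"Cooling electricity: ${low / 1e6:.1f}M to ${high / 1e6:.1f}M per year")
```

Changing `usd_per_kwh` or `utilization` shifts the result proportionally, which is why regional power pricing matters as much as the cooling technology itself.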

The cooling cost attaches to every token. A frontier model serving inference at 100 tokens per second on an 8-GPU H100 node consumes roughly 5.6 kW of power, of which approximately 1.7 to 2.0 kW goes to cooling at current air-cooled efficiency levels. At the hyperscale cloud on-demand price of $2.50 per million output tokens for a GPT-4-class model, the cooling line item per million tokens lands between $0.12 and $0.18, or roughly 5% to 7% of revenue. That is not a rounding error. For a model provider processing 1 trillion output tokens per month, cooling alone costs $120,000 to $180,000 monthly, and those numbers are for air-cooled deployments that will not support the next generation of hardware.
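To see how the per-million-token figure scales, the sketch below takes the $0.12 to $0.18 range and the $2.50 list price as given inputs rather than deriving them from hardware power draw. The helper names are hypothetical.

```python
def monthly_cooling_cost_usd(tokens_per_month: float,
                             cooling_usd_per_million: float) -> float:
    """Scale a per-million-token cooling cost to a monthly token volume."""
    return tokens_per_month / 1_000_000 * cooling_usd_per_million


def share_of_revenue(cooling_usd_per_million: float,
                     price_usd_per_million: float) -> float:
    """Cooling cost as a fraction of the per-million-token price."""
    return cooling_usd_per_million / price_usd_per_million


PRICE = 2.50                 # quoted on-demand price per million output tokens
AIR_COOLED = (0.12, 0.18)    # quoted cooling cost range per million tokens
VOLUME = 1e12                # 1 trillion output tokens per month

for cost in AIR_COOLED:
    print(f"${cost:.2f}/M tokens -> "
          f"{share_of_revenue(cost, PRICE):.1%} of revenue, "
          f"${monthly_cooling_cost_usd(VOLUME, cost):,.0f} per month")
```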

Liquid cooling changes these economics, but not uniformly. Direct-to-chip cold-plate systems, the dominant architecture in new AI deployments, can cut cooling energy to around 15% of total site power. Two-phase immersion systems push it closer to 5%. The New Atlas report on copper-plate cooling research suggested a path to reducing cooling from 30% to 35% of total consumption down to roughly 1.1%, a reduction of over 95% under lab conditions. Whether that translates to production hyperscale deployments at reasonable cost remains unproven. But the direction is unambiguous: the spread between best-in-class liquid cooling and legacy air cooling on a per-megawatt basis is now wide enough that new AI clusters pencil out only with the former.
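The gap between these architectures is easier to compare side by side. This sketch compares the quoted cooling shares against the air-cooled baseline in the same simple way the paragraph does; it does not model how total site consumption changes once cooling shrinks, and the labels are illustrative.

```python
# Cooling energy as a share of total site power, as quoted above.
AIR_BASELINE = (0.30, 0.35)
ALTERNATIVES = {
    "direct-to-chip cold plate": 0.15,
    "two-phase immersion": 0.05,
    "copper-plate research (lab)": 0.011,
}


def relative_reduction(baseline_share: float, new_share: float) -> float:
    """Fractional drop in cooling share relative to the baseline share."""
    return 1 - new_share / baseline_share


for name, share in ALTERNATIVES.items():
    lo = relative_reduction(AIR_BASELINE[0], share)
    hi = relative_reduction(AIR_BASELINE[1], share)
    print(f"{name}: {lo:.0%} to {hi:.0%} below air cooling")
```

Against either end of the 30% to 35% baseline, the 1.1% lab figure works out to a reduction of more than 95%, matching the claim in the research coverage.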

"AI is creating a new and largely overlooked strain on something fundamental to health." Sten H. Vermund and Patricia J. Kissinger, STAT News First Opinion, June 18, 2026

The energy conversation is increasingly inseparable from water and public health. In a STAT News opinion piece published June 18, Sten H. Vermund and Patricia J. Kissinger argued that AI data centers impose a public health burden through electricity demand that strains grids, water consumption that competes with residential and agricultural use, and air-quality degradation from backup diesel generation. The Associated Press reported on June 3 that a United Nations University study found the environmental footprint of data centers already rivals that of some countries when measured across energy, water, and materials.

The UN finding reframes the debate. A data center campus consuming 500 MW of power and 1.5 million gallons of water per day for evaporative cooling, located in a water-stressed region like Arizona or northern Virginia, has a quantifiable public health externality that is not priced into any GPU-hour invoice. New Jersey Governor Mikie Sherrill unveiled a statewide strategy in late May to rein in data center growth specifically because of the impact on resident electricity bills, NBC New York reported. The NEMA, ASHRAE, and PNNL consortium released a new performance framework for AI data center efficiency on June 12, targeting thermal management and resilience standards that did not exist when most current facilities were designed.

Startups are raising serious capital to close the cooling gap. SiliconANGLE reported on June 2 that ZutaCore closed a $100 million Series C to scale its waterless two-phase liquid cooling technology, with backing from Samsung Ventures, Mitsubishi Electric, and Carrier Ventures. The company's HyperCool system uses a dielectric refrigerant that boils on contact with the chip, removing heat without water or a secondary loop. At scale, a waterless system decouples the data center siting decision from water availability, which is precisely the constraint that has made new permits difficult in Dublin, Loudoun County, and Phoenix.

The extreme end of the decoupling argument is orbit. CNBC reported on June 21 that SpaceX is advancing orbital AI data center plans, with early deployments slated for 2027. Quartz noted on June 17 that startups are racing to build orbital capacity before Big Tech locks in terrestrial sites. The economic case for space-based AI compute, however, remains unproven. Launch costs, radiation hardening, and latency for earthbound inference requests each introduce cost multipliers that make even the most expensive terrestrial liquid cooling look cheap. CNBC's assessment was blunt: the economic case is "questionable."

The terrestrial land grab continues anyway. MarketBeat reported on June 19 that Rackspace Technology and AMD signed a massive agreement to deploy AI infrastructure, part of a pattern of second-tier cloud providers positioning themselves as alternatives to the hyperscalers for AI workloads. Oilprice.com noted on June 16 that a $130 million company signed a 15-year, $2.6 billion lease for power infrastructure, a validation that electricity access, not GPU availability, has become the binding constraint on the AI boom. Bloom Energy's 2026 survey, cited by Crypto Briefing on June 15, projected AI data center capacity rising from 13% to 23% of total U.S. data center load by 2030, with 55 GW of new capacity coming online.

What this means for the per-token economy

The margin structure of the inference stack is being rewritten by cooling economics. A hyperscaler paying $0.075 per kWh for electricity and running direct-to-chip liquid cooling at 15% overhead can deliver tokens at a cooling cost of roughly $0.06 to $0.09 per million output tokens, roughly half the cost of an air-cooled equivalent. The savings accrue to the infrastructure owner, not the model provider renting cloud capacity at list price. But for vertically integrated labs running their own silicon and their own data halls, the $0.06 to $0.09 saved per million tokens translates into a direct margin improvement of roughly 2.4 to 3.6 percentage points on inference revenue at a $2.50 price. At the scale of 10 trillion tokens per month, that is $600,000 to $900,000 per month in pure operating-cost savings.

The question of who captures the cooling dividend is unresolved. If cloud providers invest in liquid cooling and keep on-demand pricing flat, they absorb the margin improvement. If model providers build owned-and-operated inference clusters, they capture it. The third scenario, and the one most consistent with how compute markets have evolved historically, is that cooling efficiency gains are competed away within 18 to 24 months as lower per-token costs flow through to lower prices. The Grand View Research forecast of a 20.1% CAGR through 2033 implies the cooling supply chain expects sustained investment regardless of which party books the margin.

There is also a compute-architecture question that cooling numbers expose. A GB200 NVL72 rack running batch size 32 inference produces roughly 4 times the throughput of an equivalent H100 footprint, but its 120 kW thermal load means the cooling cost per token drops only if the facility is liquid-cooled from the slab up. Retrofitting an air-cooled facility for 120 kW per rack is often more expensive than building new. This creates a bifurcation: inference workloads that can tolerate batch size 1 on last-generation hardware can still run profitably on air-cooled infrastructure, while high-throughput, low-latency inference on current-generation hardware requires liquid cooling from day one. The depreciation schedules on these two classes of data center are diverging.

What to watch

Working backward from $29.5 billion in 2033 at a 20.1% CAGR implies roughly $8 billion to $9 billion in cooling infrastructure spending this year, growing to about $12 billion by 2028. Those are forecasts, not invoices. The number to track is not the market size but the per-rack cooling cost at batch size 1 versus batch size 32, and the spread between new-build liquid and retrofit air. When that spread appears in the public cloud pricing pages as a line item, or when a large model provider discloses cooling cost in a regulatory filing, the numbers will move from analyst decks to the per-token invoice. That is when the cooling conversation stops being an infrastructure story and becomes a cost-of-goods-sold story for every AI company shipping tokens.
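The implied spending path follows from compounding the forecast backward. The sketch below assumes 2026 is the base year of the 20.1% CAGR (the forecast says "from 2026 onward") and that growth is constant each year, which a real market will not deliver. The function names are illustrative.

```python
def backcast(final_value: float, cagr: float, years: int) -> float:
    """Value `years` before `final_value`, assuming constant annual growth."""
    return final_value / (1 + cagr) ** years


def project(start_value: float, cagr: float, years: int) -> float:
    """Value `years` after `start_value`, assuming constant annual growth."""
    return start_value * (1 + cagr) ** years


TARGET_2033_BN = 29.5   # Grand View Research forecast, in $ billions
CAGR = 0.201
BASE_YEAR = 2026        # assumption: compounding starts from 2026

market_2026 = backcast(TARGET_2033_BN, CAGR, 2033 - BASE_YEAR)
market_2028 = project(market_2026, CAGR, 2028 - BASE_YEAR)
print(f"Implied 2026 market: ${market_2026:.1f}B")
print(f"Implied 2028 market: ${market_2028:.1f}B")
```

Under those assumptions the implied base lands in the $8 billion to $9 billion range and the 2028 figure just under $12 billion.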

The $2.6 billion, 15-year power lease flagged by Oilprice.com, the $100 million ZutaCore raise, and the NEMA-ASHRAE framework all point in the same direction. Cooling is no longer an engineering afterthought. It is a pricing input. And pricing inputs get negotiated.
