The AI Data Center Obsolescence Crisis: Why Physics is Popping the Bubble
By Tony Grayson | Independent Strategic Advisor | Top 10 Data Center Influencer | Former CO USS Providence (SSN-719) | Former SVP Oracle · AWS · Meta
Published: November 20, 2025 | Updated: February 22, 2026 | Verified: February 22, 2026

TL;DR — The Physics Don't Care About the Business Plan
120kW racks cannot be cooled with air. Most facilities can't support them without expensive retrofits.
600kW roadmap (Nvidia 2027) means today's 120kW-designed facilities face a second wave of obsolescence.
3,000 lb racks exceed the weight ratings of most raised floors — structural obsolescence is real.
CoreWeave: $18.8B debt raised against hardware that depreciates in 2-3 years while booked for 6.
Google TPU: 30-44% lower TCO than Nvidia GB200 for inference — hyperscalers don't pay Nvidia's margin.
Key Concepts
AI Data Center Obsolescence: When existing facilities become structurally, thermally, or electrically unable to support current AI hardware. Driven by NVIDIA's 18-month refresh cycle and three simultaneous mismatch vectors: power density, cooling, and weight.
Direct-to-Chip Liquid Cooling (DLC): Pumps coolant directly to GPU heat sinks. Mandatory above ~30-40kW per rack. Requires CDUs, rack-level water loops, and often concrete trenching. Most colocation facilities don't have it. Retrofit cost: $1-5M per megawatt.
OpEx Obsolescence: Hardware that still functions but costs 10x more to operate than the next generation. If Blackwell delivers the same inference workload for 1/10th the power cost of Hopper, Hopper-based infrastructure is OpEx-obsolete instantly.
GPU Depreciation Mismatch: Accounting depreciation over 6 years vs economic life of 2-3 years. Jonathan Ross (Groq CEO) argues 1-2 year amortization schedules are correct. The gap represents potential massive write-downs on the industry balance sheet.
NVIDIA Vendor Financing: Circular structure where NVIDIA invests in customers to buy its own chips. Creates artificial demand that functions as a bridge during growth — and as an anchor when organic revenue slows.
Stranded Asset: A facility or hardware asset whose economic value has been destroyed by obsolescence before its expected financial life ends. The central risk in AI data center investment today.
COMMANDER'S INTENT:
In the submarine force, we had a distinct difference between "hearing a transient noise" and "confirming a contact." For two years, we heard noise about an AI bubble. Today, we have a confirmed contact. The receipts are in: 120kW racks that facilities can't support. $18.8B in CoreWeave debt. Depreciation schedules that assume 6-year asset life for hardware that goes OpEx-obsolete in 24 months.
You can't argue with physics. And the physics are not on the side of the current financing model.
— Tony Grayson, Former CO USS Providence (SSN-719) | Former SVP Oracle, AWS, Meta
In my years managing physical infrastructure at Oracle, building AWS's global design and engineering practice, and running operations at Meta, I've seen technology cycles come and go.
For the past year, the industry has asked, "Is this an AI bubble?" It was a fair question when we were dealing with projections and PowerPoint decks.
But in the last quarter, the narrative shifted from asking "Is it a bubble?" to explaining why it is.
We moved from forecasting risks to measuring them. The lag time between the hype and the physics has finally closed, and we now have the receipts:
The Physics Bill Came Due: We aren't just predicting cooling limits anymore; we are seeing facilities rejected because they physically cannot support 120kW+ liquid-cooled racks. This is AI Data Center Obsolescence, and it will accelerate with NVIDIA's 18-month refresh cycle.
The Debt is Visible: We aren't guessing about vendor financing; we can see the multi-billion dollar debt loads on balance sheets that far outstrip organic revenue.
The Depreciation Math Broke: We aren't theorizing about asset lifespan; we are watching 2-year product cycles render standard 6-year depreciation schedules mathematically impossible.
The data is in, and it tells us that the current financing and infrastructure models are unsustainable.
Here is the diagnosis of why the burst is inevitable.
The consensus in late 2025:
CoreWeave's IPO validates the AI infrastructure investment thesis. NVIDIA's product cycle is a tailwind. Data center construction backlogs prove demand is real. The build-out is inevitable.
My position: the demand is real. The physics make the current business model impossible.
The distinction matters. CoreWeave's $18.8B debt load isn't evidence of demand — it's evidence of the gap between demand and the economics of serving it. Every new NVIDIA generation that delivers 10x performance creates a new wave of stranded assets in the generation before it. Companies are signing 10-year leases on facilities built for 6-10kW racks and calling it an AI strategy.
The contact is confirmed. The question isn't whether the bubble exists. It's who's holding the bag when it clears.
1. AI Data Center Obsolescence: The GB200 & Liquid Cooling Crisis
This is where my background in nuclear engineering and data center design becomes relevant. I've spent my career managing thermal loads, and what I'm seeing now violates the principles of sustainable infrastructure design. Hence the term: AI Data Center Obsolescence.
The Power Density Problem
Today's Nvidia GB200 NVL72 systems require approximately 120–132kW per rack, vastly outstripping the traditional 6–10kW standard. However, Nvidia's roadmap targets approximately 600kW per cabinet by 2027.
The Reality: Moving to 600kW isn't an upgrade; it is a complete infrastructure replacement. A facility built today for 120kW cannot simply "scale up" to 600kW without gutting its entire cooling and electrical backbone.
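The density math is easy to sanity-check yourself. A minimal sketch, using the per-rack figures cited above and a hypothetical 10MW facility (my assumption, not a figure from any specific site):

```python
def racks_supported(facility_kw: int, rack_kw: int) -> int:
    """How many racks of a given power density a fixed IT power envelope supports."""
    return facility_kw // rack_kw

facility_kw = 10_000  # hypothetical facility with 10 MW of critical IT load
for rack_kw in (8, 120, 600):  # legacy standard, GB200 NVL72, 2027 roadmap
    print(f"{rack_kw:>3} kW racks: {racks_supported(facility_kw, rack_kw)} per 10 MW")
```

The same building goes from roughly 1,250 legacy racks to 83 GB200 racks to 16 racks at the 600kW roadmap density, which is why the power and cooling backbone, not floor space, becomes the binding constraint.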
The Weight & Structural Limits
A GB200 NVL72 configuration weighs approximately 3,000 lbs. Most legacy raised floors support only 1,500–2,000 lbs per tile. You cannot roll these racks onto a standard floor without significant structural reinforcement. Facilities built just 3 years ago are facing structural obsolescence.
The Cooling Problem: The $5 Million "Oops"
You cannot cool 120–600kW racks with air. It requires Direct-to-Chip (DLC) liquid cooling, necessitating CDUs (Coolant Distribution Units) and rack-level water loops. Most existing colocation footprints lack this infrastructure. Industry estimates peg the cost of retrofitting a standard 10MW facility for Direct-to-Chip cooling at $1–5 million per megawatt in CapEx, often involving trenching concrete slabs. For a 50MW facility, you are looking at a massive unplanned bill just to prepare the floor for the rack.
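A back-of-envelope sketch of that retrofit bill, using the $1–5M per megawatt industry range cited above; the facility sizes are illustrative assumptions, not quotes for any specific site:

```python
def dlc_retrofit_range(facility_mw: float,
                       low_per_mw: float = 1e6,
                       high_per_mw: float = 5e6) -> tuple[float, float]:
    """Return (low, high) CapEx in dollars to retrofit a facility for DLC."""
    return facility_mw * low_per_mw, facility_mw * high_per_mw

for mw in (10, 50):
    lo, hi = dlc_retrofit_range(mw)
    print(f"{mw} MW facility: ${lo/1e6:.0f}M to ${hi/1e6:.0f}M")
```

For the 50MW case, that is $50–250 million of unplanned CapEx before a single GPU is installed.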
"You can't argue with physics. You cannot cool 600kW racks with infrastructure designed for 20kW. You cannot load 6,000 lbs onto a floor rated for 2,000 lbs."
— Tony Grayson, Former SVP Oracle · AWS · Meta | Top 10 Data Center Influencer
2. Circular Financing Risks: The Debt Trap
The financial structure supporting this build-out bears a striking resemblance to the vendor-financing bubbles of the past.
The Numbers
CoreWeave Debt: CoreWeave's total debt has ballooned significantly. Recent reports indicate debt commitments may now exceed $18.8 billion. To put this in perspective, the company raised over $12.7 billion in debt and equity in a single 12-month window. This isn't just growth capital; it is a desperate race against the obsolescence clock.
The Loop: Nvidia finances its customers (CoreWeave, arguably OpenAI) to buy its own chips. As reported, OpenAI's CFO Sarah Friar has faced scrutiny for suggesting government backstops for trillion-dollar infrastructure commitments.
The Result: When organic revenue growth slows (as Nvidia's has, stepping down to the mid-50s to 60% range), this circular financing stops acting as a bridge and starts acting as an anchor.
3. Accelerated Hardware Obsolescence: The "Big Short 2.0"
Nvidia has moved to an annual product cycle: Hopper (2022) → Blackwell (2024) → Rubin (2026).
The Depreciation Mismatch
Economic Life: Leaders like Jonathan Ross (Groq) argue that AI accelerators should be amortized on 1–2 year schedules due to 10x performance leaps per generation.
Accounting Life: Many providers use 6-year depreciation schedules to make their P&L look healthier.
The Consequence: If Blackwell delivers the same workload for 1/10th the power cost of Hopper, the older hardware becomes "OpEx obsolete" instantly. Companies are booking assets for 6 years that may be economically worthless in 24 months.
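The write-down exposure in that mismatch can be sketched with simple straight-line depreciation. A minimal illustration, assuming a hypothetical $100M GPU fleet (the schedules come from the argument above; the fleet size is my assumption):

```python
def straight_line_book_value(cost: float, life_years: float, age_years: float) -> float:
    """Remaining book value of an asset under straight-line depreciation."""
    remaining_fraction = max(0.0, 1.0 - age_years / life_years)
    return cost * remaining_fraction

fleet_cost = 100e6  # hypothetical $100M GPU fleet

book_6yr = straight_line_book_value(fleet_cost, life_years=6, age_years=2)  # accounting schedule
book_2yr = straight_line_book_value(fleet_cost, life_years=2, age_years=2)  # economic schedule (Ross)

print(f"Book value at 24 months, 6-year schedule: ${book_6yr/1e6:.1f}M")
print(f"Economic value at 24 months, 2-year schedule: ${book_2yr/1e6:.1f}M")
print(f"Write-down exposure: ${(book_6yr - book_2yr)/1e6:.1f}M")
```

At the 24-month mark, the 6-year schedule still carries roughly two-thirds of the purchase price on the balance sheet while the 2-year economic view says the asset is fully consumed. That gap is the write-down risk, multiplied across every fleet in the industry.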
4. Commoditization: Google TPUs vs. Nvidia Margins
While the market focuses on Nvidia, Google has quietly built a defensive moat.
Google TPUs: Google's TPU v5p and upcoming chips perform on par with Nvidia on inference workloads but offer significantly better performance per watt.
The TCO Reality: Recent analysis suggests Google’s TPU architecture offers a 30–44% lower Total Cost of Ownership (TCO) compared to Nvidia GB200 clusters for inference workloads. When your infrastructure bill is $100M, a 30% savings isn't an efficiency tweak; it's a fiduciary obligation.
The Margin Trap: Because Google uses its own silicon, it doesn't pay Nvidia's 75% margin. Merchant compute providers must pay Nvidia's full margin while competing against hyperscalers who don't.
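A rough sketch of the margin trap, using the 75% gross-margin figure cited above and a hypothetical $100M cluster purchase (my assumption). This deliberately oversimplifies, since gross margin covers more than silicon cost, but it shows the order of magnitude:

```python
def own_silicon_cost_basis(merchant_price: float, vendor_gross_margin: float) -> float:
    """Rough hardware cost basis for a buyer who builds its own silicon
    instead of paying a vendor's gross margin on top of cost."""
    return merchant_price * (1.0 - vendor_gross_margin)

merchant_price = 100e6        # hypothetical cluster price a merchant provider pays NVIDIA
nvidia_gross_margin = 0.75    # figure cited in the article

hyperscaler_cost = own_silicon_cost_basis(merchant_price, nvidia_gross_margin)
print(f"Merchant provider pays: ${merchant_price/1e6:.0f}M")
print(f"Hyperscaler rough cost basis on own silicon: ${hyperscaler_cost/1e6:.0f}M")
```

Even as a crude approximation, a competitor whose hardware cost basis is a quarter of yours can undercut your pricing indefinitely. That is the structural disadvantage merchant compute providers face.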
Frequently Asked Questions: AI Data Center Obsolescence and NVIDIA Vendor Financing
Is the AI infrastructure market in a bubble?
Yes, the data indicates a "financing bubble." While demand for AI is real, the market is distorted by circular financing—where vendors like NVIDIA invest in their own customers (e.g., CoreWeave) to purchase their products. CoreWeave's debt now exceeds $18.8B; the company raised over $12.7 billion in debt and equity in a single 12-month window. This isn't growth capital; it's a race against the obsolescence clock.
Why is data center cooling such a major problem for AI?
Traditional air cooling works up to ~30kW per rack. Modern AI racks (NVIDIA GB200 NVL72) consume 120kW+ and NVIDIA's roadmap targets 600kW per cabinet by 2027. This requires Direct-to-Chip (DLC) liquid cooling, which most existing data centers cannot support. Retrofitting costs $1-5 million per megawatt, often requiring trenching concrete slabs.
What is the risk of asset mismatch in AI chips?
Companies depreciate GPUs over 6 years for accounting purposes, but chips often become economically obsolete in 2-3 years due to 10x efficiency gains per generation. Jonathan Ross (Groq CEO) argues AI accelerators should be amortized on 1-2 year schedules. Companies are booking assets for 6 years that may be worthless in 24 months, creating massive write-down risk.
What is NVIDIA vendor financing?
NVIDIA vendor financing is circular financing where NVIDIA invests in its own customers to purchase its products. NVIDIA has invested in CoreWeave and arguably OpenAI, creating a loop where vendor capital flows back as GPU purchases. When organic revenue growth slows, this circular financing stops acting as a bridge and becomes an anchor dragging down the ecosystem.
How much does a NVIDIA GB200 NVL72 rack weigh?
A GB200 NVL72 configuration weighs approximately 3,000 lbs. Most legacy raised floors support only 1,500-2,000 lbs per tile. You cannot roll these racks onto standard floors without significant structural reinforcement. Facilities built just 3 years ago face structural obsolescence—they physically cannot support modern AI hardware.
What is Direct-to-Chip liquid cooling?
Direct-to-Chip (DLC) liquid cooling pumps coolant directly to GPU heat sinks rather than relying on air circulation. It requires Coolant Distribution Units (CDUs) and rack-level water loops. This is mandatory for 120kW+ racks because air physically cannot remove that much heat. Most colocation facilities lack this infrastructure, requiring expensive retrofits.
How much does CoreWeave owe in debt?
CoreWeave's total debt commitments now exceed approximately $18.8 billion. They raised over $12.7 billion in debt and equity in a single 12-month window. To put this in perspective, this debt load far outstrips organic revenue. The company is racing against hardware obsolescence, betting that revenue growth will outpace depreciation and interest payments.
How does Google TPU compare to NVIDIA GPUs?
Google's TPU v5p performs on par with NVIDIA on inference workloads but offers 30-44% lower Total Cost of Ownership (TCO). Because Google uses its own silicon, it doesn't pay NVIDIA's 75% gross margin. Merchant compute providers must pay NVIDIA's full margin while competing against hyperscalers who don't—creating an unwinnable economics trap.
What is NVIDIA's product refresh cycle?
NVIDIA has moved to an annual product cycle: Hopper (2022) → Blackwell (2024) → Rubin (2026). Each generation delivers roughly 10x performance improvements. If Blackwell delivers the same workload for 1/10th the power cost of Hopper, older hardware becomes "OpEx obsolete" instantly—regardless of how new it is.
Why are existing data centers becoming obsolete for AI?
AI data center obsolescence is driven by three factors: (1) Power density—facilities built for 6-10kW racks cannot support 120-600kW AI racks; (2) Cooling—air cooling physically cannot remove 120kW+ of heat, requiring liquid cooling infrastructure most facilities lack; (3) Structural limits—3,000 lb racks exceed floor weight ratings. A facility built today for 120kW cannot scale to 600kW without gutting its entire infrastructure.
What is the cost to retrofit a data center for liquid cooling?
Industry estimates peg retrofitting costs at $1-5 million per megawatt in CapEx. For a 50MW facility, you're looking at $50-250 million in unplanned spending just to prepare floors for modern AI racks. This often involves trenching concrete slabs, installing CDUs, and running water loops to every rack position. See also: The Industrialized Data Center Strategy.
Who is Tony Grayson?
Tony Grayson is an independent strategic advisor and former SVP at Oracle ($1.3B budget, 35+ cloud regions), AWS, and Meta (30+ data centers). He commanded nuclear submarine USS Providence (SSN-719) and received the Stockdale Award. His nuclear engineering background — managing thermal loads, coolant systems, and reactor physics — informs his analysis of the physical limits driving AI data center obsolescence. More at tonygrayson.ai
Are AI data centers a bad investment?
They can be — if you're buying into the wrong part of the stack or the wrong vintage of infrastructure. Facilities built for 6-10kW racks that cannot support liquid cooling are stranded assets today. Facilities built for 120kW racks face a second wave of obsolescence when Nvidia's roadmap hits 600kW by 2027. The companies that will survive are those that design for replaceability — modular infrastructure that can swap compute generations without gutting power and cooling. That's the core thesis behind the Industrialized Data Center Strategy. The investment isn't bad — the design assumption is.
Conclusion
In my nuclear submarine training, we had a saying: "You can't argue with physics."
You cannot cool 600kW racks with infrastructure designed for 20kW. You cannot load 6,000 lbs onto a floor rated for 2,000 lbs. And you cannot sustain triple-digit growth indefinitely through vendor financing. The companies that survive will be those that respect the physics and align their balance sheets with the brutal reality of 2-year asset lifecycles.
The lag time is over. The contact is confirmed.
Tony Grayson
"Moving to 600kW isn't an upgrade. It's a complete infrastructure replacement. A facility built today for 120kW cannot simply 'scale up' to 600kW without gutting its entire cooling and electrical backbone."
— Tony Grayson, Independent Strategic Advisor
____________________________
Tony Grayson is an independent strategic advisor and recognized Top 10 Data Center Influencer.
A former U.S. Navy Submarine Commander and recipient of the VADM Stockdale Award, Tony led global infrastructure as SVP at Oracle ($1.3B budget, 35+ cloud regions), AWS, and Meta (30+ data centers) before his current advisory work. He serves on advisory boards for TerraPower and Holtec International, and as Veterans Chair for Infrastructure Masons.
Read more at tonygrayson.ai



