compute-scarcity
3 items · chronological order
We're Using So Much AI That Computing Firepower Is Running Out
The compute scarcity thesis just went mainstream: WSJ reports Anthropic's 98.95% uptime as enterprise clients defect to OpenAI, Blackwell GPUs up 48% in two months, and OpenAI killed Sora to free tokens for coding. The buried signal isn't the shortage itself; it's that Retool's CEO switching providers over reliability — not capability — previews what happens when inference demand compounds faster than infrastructure can respond. The company that solves five-nines for AI inference will own enterprise, regardless of whose model benchmarks best.
We're Using So Much AI That Computing Firepower Is Running Out
Retool's CEO switched from Anthropic to OpenAI this quarter, and the reason wasn't a benchmark: it was 98.95% uptime versus the alternative. Enterprise AI competition has shifted from capability to reliability, the same transition cloud infrastructure went through in 2010. The Anthropic paper this week shows the same pattern one layer up: automated alignment research can generate at $22/hour, but generation without stable evaluation infrastructure is just faster reward-hacking. Davies' vigilance decrement argument lands it at the human layer: even if the infrastructure holds, the person reviewing outputs degrades before the system does. Whoever solves five-nines for the full stack, model plus evaluation plus human judgment, owns enterprise regardless of whose Elo score leads.
AI is confronting a supply-chain crunch
Hyperscaler capex grew 190% from 2024 to 2026; their hardware suppliers grew 45%. That gap is why every throttling notice, plan change, and Sora shutdown traces back to the same constraint. The less-discussed dimension: agentic systems need 1 CPU per GPU versus 1:12 for chatbots, which is why Intel has doubled in six months and why every agent platform deck needs a CPU supply slide.