nvidia

9 items

The Verge 2026-06-02-1

Microsoft to unveil new AI models and Windows improvements at Build

Build 2026 is a developer-trust-repair operation with a second plot running underneath it. Microsoft is assembling the full OpenAI-independence stack: its first reasoning model trained without distillation, its own image models, a new agent, and a hard push toward local inference on Windows silicon. The "no distillation" detail is the tell — Microsoft wants to prove it can train reasoning without learning from another model's outputs.

⟷ links
art_20260403_microsoft-mid-class-model-admission-comp
art_20260529_engadget-microsoft-s-buttoned-up-copilot
2026-03-27-3
2026-04-07-2

The Verge 2026-06-02-3

Microsoft and OpenAI broke up — now they're ready to fight

At Build 2026, Suleyman did the rarest thing an AI exec can do: ranked his own company outside the top tier. The humility is the strategy, not a weakness. Microsoft is shipping from-scratch models, custom silicon, and a vendor-neutral Windows-native harness while explicitly competing on cost, distribution, and 11,000-model optionality rather than capability. The frontier-lab leaderboard the press scores is the wrong scoreboard; whoever owns enterprise distribution, governance, and the cheapest good-enough model captures the value, and Microsoft is deliberately choosing to fight there.

Dwarkesh Podcast 2026-05-28-1

Reiner Pope on Chip Design from the Bottom Up: Data Movement Dominates Arithmetic 7-to-1, B300's FP4-FP8 Gap as First Crack in NVIDIA's FLOPS Marketing, Splittable Systolic Arrays as Maddox's Architectural Wedge

NVIDIA's B300 datasheet ships FP4 at 3x FP8 speed where precision-scaling theory says 4x: the first public number at odds with using marketed FLOPS as a benchmark. The durable accelerator moat is array geometry plus memory hierarchy, not transistor budget. That's why Maddox, Majestic, Groq, and Cerebras all exist as funded alternatives, each architecture matched to a workload profile the general-purpose chip handles inefficiently. By 2027, the enterprise procurement question moves from "NVIDIA versus not" to "which architectural bet fits the inference batch size."
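The size of that FP4-FP8 gap can be worked out in a few lines. This is a minimal sketch that takes the 3x and 4x figures from the summary above as given; the function name `scaling_shortfall` is illustrative and does not come from the episode.

```python
def scaling_shortfall(observed_speedup: float, expected_speedup: float) -> float:
    """Return the fraction of the expected speedup that is missing."""
    if expected_speedup <= 0:
        raise ValueError("expected_speedup must be positive")
    return 1.0 - observed_speedup / expected_speedup


# Figures as quoted in the summary: FP4 at 3x FP8, theory says 4x.
shortfall = scaling_shortfall(observed_speedup=3.0, expected_speedup=4.0)
delivered = 1.0 - shortfall

print(f"FP4 delivers {delivered:.0%} of the expected speedup "
      f"({shortfall:.0%} short)")
```

On these inputs a quarter of the theoretical speedup is missing, which is the discrepancy the summary calls the first crack.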

isaiprofitable.com 2026-05-26-2

Is AI Profitable Yet? — $1.4T Spend vs $613B Revenue, Attribution as the Unfalsifiable Hinge

A solo-dev dashboard puts cumulative industry AI spend at $1.4T against $613B in direct revenue — 33% recovery for pure labs, 7% for hyperscalers, and NVIDIA the only company in the dataset where AI revenue is actually cash-generative. The methodology excludes indirect revenue (Search ad lift, Copilot bundle stickiness, Bedrock attach) because attribution is genuinely unreliable, which is precisely the part the bull case depends on. Bull and bear are consistent with the same data; in public markets, unfalsifiable narratives don't unwind gradually.

Axios 2026-05-21-2

Two hours that changed AI

Anthropic's first profitable quarter is the wrong headline. The $559M of operating profit will fund $1.25B per month of compute commitments to Elon Musk's SpaceX through 2029 — roughly $15B per year flowing to a single counterparty who also runs xAI. Lab IPO valuations need a compute-supplier-concentration discount that nobody is modeling, and Axios packaging six scheduled disclosures as "two hours that changed AI" is itself the late-cycle consensus marker.
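The commitment arithmetic is easy to check. This is a rough sketch using only the two dollar figures quoted above. Setting one quarter's operating profit against three months of commitments is an illustrative yardstick, not an accounting statement: it assumes the commitment is paid evenly each month and ignores whether compute costs are already counted inside operating expenses.

```python
# Figures as quoted in the summary above.
MONTHLY_COMMITMENT_USD = 1.25e9          # compute commitment per month
QUARTERLY_OPERATING_PROFIT_USD = 559e6   # first profitable quarter

# Annualized commitment: 12 even monthly payments (assumption).
annual_commitment = MONTHLY_COMMITMENT_USD * 12

# Three months of commitments, to compare against one quarter of profit.
quarterly_commitment = MONTHLY_COMMITMENT_USD * 3
coverage = QUARTERLY_OPERATING_PROFIT_USD / quarterly_commitment

print(f"Annual commitment: ${annual_commitment / 1e9:.1f}B")
print(f"Quarterly profit covers {coverage:.0%} of one quarter's commitment")
```

On these numbers the quarter's operating profit covers roughly a seventh of the same quarter's commitment, which is why the counterparty concentration matters more than the profit headline.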

The Deep View 2026-05-07-1

OpenAI MRC Protocol: What Gets Open-Sourced Is the Non-Moat

What frontier labs open-source is a map of the non-moats. OpenAI released its GPU networking protocol through OCP with Microsoft, AMD, Broadcom, NVIDIA, and Intel as coalition partners, two years in development, already running at Stargate's Abilene site and used to train GPT-5.5. The corollary lands hardest for Microsoft: they have the protocol, run it on Fairwater, and still ship mid-class models, which means networking efficiency was never the binding constraint.

The Economist 2026-04-29-1

AI is confronting a supply-chain crunch

Hyperscaler capex grew 190% from 2024 to 2026; their hardware suppliers grew 45%. That gap is why every throttling notice, plan change, and Sora shutdown traces back to the same constraint. The less-discussed dimension: agentic systems need one CPU per GPU, versus one CPU per 12 GPUs for chatbots, which is why Intel has doubled in six months and why every agent platform deck needs a CPU supply slide.
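Both ratios can be made concrete with a short calculation. This is a sketch under stated assumptions: "grew 190%" and "grew 45%" are read as increases over the 2024 base (2.9x and 1.45x), the fleet size is a hypothetical round number, and `cpus_required` is an illustrative helper rather than anything from the article.

```python
def cpus_required(gpus: int, gpus_per_cpu: int) -> float:
    """CPUs needed to serve a GPU fleet at a given GPU-per-CPU ratio."""
    if gpus_per_cpu <= 0:
        raise ValueError("gpus_per_cpu must be positive")
    return gpus / gpus_per_cpu


# Growth gap: demand (capex) multiple versus supply multiple, 2024 to 2026.
capex_multiple = 1 + 1.90      # +190%, read as growth over the base year
supplier_multiple = 1 + 0.45   # +45%, same reading
demand_over_supply = capex_multiple / supplier_multiple

# CPU demand for the same GPU fleet under each workload profile.
fleet_gpus = 12_000                               # hypothetical fleet size
chatbot_cpus = cpus_required(fleet_gpus, 12)      # one CPU per 12 GPUs
agent_cpus = cpus_required(fleet_gpus, 1)         # one CPU per GPU
cpu_multiplier = agent_cpus / chatbot_cpus

print(f"Capex grew about {demand_over_supply:.1f}x faster than supplier output")
print(f"Agentic workloads need {cpu_multiplier:.0f}x the CPUs per GPU fleet")
```

Under this reading, spending outran supplier growth by about a factor of two, and moving the same GPU fleet from chatbot to agentic workloads multiplies CPU demand by 12.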

CNBC 2026-03-24-2

Nvidia's Huang pitches AI tokens on top of salary as agents reshape how humans work

Jensen Huang isn't selling GPUs at GTC: he's selling the accounting category that makes buying them non-discretionary. Tokens-as-compensation reclassifies compute from IT discretionary to people cost; if that framing sticks, AI budgets become as unkillable as headcount. The buried lede is the 80-85% AI project failure rate since 2018 sitting in paragraph 25 while Huang envisions "hundreds of thousands of digital employees" in paragraph 7. That gap between aspiration and execution is the real signal: the demand narrative for compute is bulletproof, but agent reliability at scale remains the unpriced risk.

The Economist 2026-03-21-3

Nvidia's Full-Stack Reinvention: The $65B Portfolio Isn't a Moat, It's a Dependency Map

The Economist's GTC week profile frames Nvidia's expansion into networking, CPUs, models, and sovereign AI as a strategic reinvention; the article never asks the margin question. Nvidia's $216B revenue at ~73% gross margin is a GPU monopoly number: networking, CPU-only servers, and government bundles don't carry that margin. The $65B investment portfolio ($30B in OpenAI alone) is presented as ecosystem lock-in, but OpenAI already runs inference on Azure custom silicon. The portfolio isn't a moat; it's a subsidy that masks true cost-of-compute and unwinds the moment inference gets cheap enough on non-Nvidia hardware. The buried structural risk: three hyperscalers account for over half of receivables, and those same three are the ones building the substitutes.