Groq

4 items

Dwarkesh Podcast 2026-05-28-1

Reiner Pope on Chip Design from the Bottom Up: Data Movement Dominates Arithmetic 7-to-1, B300's FP4-FP8 Gap as First Crack in NVIDIA's FLOPS Marketing, Splittable Systolic Arrays as Maddox's Architectural Wedge

NVIDIA's B300 datasheet ships FP4 at 3x FP8 speed where precision-scaling theory says 4x — the first public number that doesn't square with marketed FLOPS as a benchmark. The durable accelerator moat is array geometry plus memory hierarchy, not transistor budget: that's why Maddox, Majestic, Groq, and Cerebras all exist as funded alternatives, each architecture matched to a workload profile the general-purpose chip handles inefficiently. By 2027, enterprise procurement moves from NVIDIA versus not to which architectural bet fits the inference batch size.

The Economist 2026-03-21-3

Nvidia's Full-Stack Reinvention: The $65B Portfolio Isn't a Moat, It's a Dependency Map

The Economist's GTC week profile frames Nvidia's expansion into networking, CPUs, models, and sovereign AI as a strategic reinvention; the article never asks the margin question. Nvidia's $216B revenue at ~73% gross margin is a GPU monopoly number: networking, CPU-only servers, and government bundles don't carry that margin. The $65B investment portfolio ($30B in OpenAI alone) is presented as ecosystem lock-in, but OpenAI already runs inference on Azure custom silicon. The portfolio isn't a moat; it's a subsidy that masks true cost-of-compute and unwinds the moment inference gets cheap enough on non-Nvidia hardware. The buried structural risk: three hyperscalers account for over half of receivables, and those same three are the ones building the substitutes.

Wall Street Journal 2026-03-17-2

Can Nvidia's Dominance Survive the Sea Change Under Way in AI Computing?

Nvidia's 73% GPU margins are structurally incompatible with an efficiency-first inference economy, but the displacement story isn't "Cerebras replaces Nvidia." Inference is heterogeneous, and Nvidia is racing to sell all three form factors: GPU for training, CPU for orchestration, LPU for inference throughput. The transition from monopolist-margin chipmaker to platform-margin integrator is the real architectural bet at GTC this year.

New York Times 2026-03-17-3

Nvidia Built the A.I. Era. Now It Has to Defend It.

Nvidia is the first major chipmaker to unbundle training from inference at the architecture level, pairing its GPUs with Groq's inference-optimized LPUs in a $20B licensing deal. The supply chain math is as interesting as the product: Groq on Samsung fab with no HBM dependency sidesteps both TSMC allocation constraints and memory chip shortages. If inference grows to 70-80% of total AI compute spend, the companies building chip-agnostic inference routing will capture a new middleware layer that doesn't exist yet.