Groq — tisram

Dwarkesh Podcast 2026-05-28-1

Reiner Pope on Chip Design from the Bottom Up: Data Movement Dominates Arithmetic 7-to-1, B300's FP4-FP8 Gap as First Crack in NVIDIA's FLOPS Marketing, Splittable Systolic Arrays as Maddox's Architectural Wedge

NVIDIA's B300 datasheet lists FP4 at 3x FP8 speed where precision-scaling theory says 4x, the first public number that doesn't square with treating marketed FLOPS as a benchmark. The durable accelerator moat is array geometry plus memory hierarchy, not transistor budget: that's why Maddox, Majestic, Groq, and Cerebras all exist as funded alternatives, each architecture matched to a workload profile the general-purpose chip handles inefficiently. By 2027, enterprise procurement moves from "NVIDIA versus not" to "which architectural bet fits the inference batch size."
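The 3x-versus-4x gap can be worked through as a quick sketch. It assumes multiplier cost scales as operand width raised to some exponent, with the quadratic case (exponent 2) standing in for the "precision-scaling theory" mentioned above; that scaling model and the function name `ideal_speedup` are illustrative assumptions, not something taken from the datasheet.

```python
def ideal_speedup(bits_from: int, bits_to: int, exponent: float = 2.0) -> float:
    """Ideal throughput gain from narrowing operands.

    Assumption: multiplier cost grows as bits ** exponent, so the same
    silicon budget fits (bits_from / bits_to) ** exponent narrower units.
    """
    return (bits_from / bits_to) ** exponent


# Quadratic assumption: halving width from 8 to 4 bits gives 4x.
ideal = ideal_speedup(8, 4)

# Ratio quoted in the summary above (FP4 over FP8).
observed = 3.0

# Fraction of the ideal gain that does not show up in the quoted figure.
shortfall = 1 - observed / ideal

# For comparison, a purely linear cost model would predict only 2x.
linear = ideal_speedup(8, 4, exponent=1.0)

print(f"ideal={ideal:.1f}x observed={observed:.1f}x shortfall={shortfall:.0%}")
print(f"linear-model prediction={linear:.1f}x")
```

Under these assumptions the quoted 3x sits between the linear prediction (2x) and the quadratic one (4x), which is one way to read the claim that something other than raw multiplier area is limiting the FP4 number.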

# tags

ai-economics ai-infrastructure semiconductor nvidia tpu inference-economics hardware-fragmentation custom-chips ai-1.0-defensibility dwarkesh semiconductors gpu-infrastructure compute-supply-chain harness-as-moat agentic-ai-viability edge-ai podcast

The Economist 2026-03-21-3

Nvidia's Full-Stack Reinvention: The $65B Portfolio Isn't a Moat, It's a Dependency Map

The Economist's GTC week profile frames Nvidia's expansion into networking, CPUs, models, and sovereign AI as a strategic reinvention; the article never asks the margin question. Nvidia's $216B revenue at ~73% gross margin is a GPU monopoly number: networking, CPU-only servers, and government bundles don't carry that margin. The $65B investment portfolio ($30B in OpenAI alone) is presented as ecosystem lock-in, but OpenAI already runs inference on Azure custom silicon. The portfolio isn't a moat; it's a subsidy that masks true cost-of-compute and unwinds the moment inference gets cheap enough on non-Nvidia hardware. The buried structural risk: three hyperscalers account for over half of receivables, and those same three are the ones building the substitutes.
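The margin argument above is simple arithmetic, sketched here under stated assumptions. The revenue, margin, and portfolio figures come from the summary; the segment split passed to `blended_margin` is purely hypothetical and only shows how a lower-margin mix pulls the blended number down.

```python
def blended_margin(segments: list[tuple[float, float]]) -> float:
    """Revenue-weighted gross margin.

    Each segment is (share_of_revenue, gross_margin); shares must sum to 1.
    """
    total_share = sum(share for share, _ in segments)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("revenue shares must sum to 1")
    return sum(share * margin for share, margin in segments)


# Figures quoted in the summary (USD billions).
revenue = 216.0
gross_margin = 0.73
portfolio = 65.0
openai_stake = 30.0

gross_profit = revenue * gross_margin    # implied gross profit
openai_share = openai_stake / portfolio  # OpenAI's share of the portfolio

# HYPOTHETICAL mix, not from the article: 80% of revenue at a 78% margin,
# 20% (networking, CPU-only servers, bundles) at a 53% margin.
example = blended_margin([(0.8, 0.78), (0.2, 0.53)])

print(f"implied gross profit: ${gross_profit:.1f}B")
print(f"OpenAI share of portfolio: {openai_share:.0%}")
print(f"hypothetical blended margin: {example:.0%}")
```

The point of the sketch is the direction of the effect: as the lower-margin share of revenue grows, the blended figure falls even if GPU margins themselves hold.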

# tags

ai-economics ai-1.0-defensibility nvidia inference-economics sovereign-ai

Wall Street Journal 2026-03-17-2

Can Nvidia's Dominance Survive the Sea Change Under Way in AI Computing?

Nvidia's 73% GPU margins are structurally incompatible with an efficiency-first inference economy, but the displacement story isn't "Cerebras replaces Nvidia." Inference is heterogeneous, and Nvidia is racing to sell all three form factors: GPU for training, CPU for orchestration, LPU for inference throughput. The transition from monopolist-margin chipmaker to platform-margin integrator is the real architectural bet at GTC this year.

# tags

ai-infrastructure semiconductors margin-compression inference-economics competitive-dynamics

◆ entities

Nvidia Groq Cerebras OpenAI Jensen Huang AWS

→ threads

ai-economics multi-model-strategy agentic-ai-viability

⟷ links

2026-03-10-1 2026-03-14-1 2026-03-16-3 2026-03-16-2 2026-03-15-1 2026-03-14-3 2026-03-12-3 2026-03-14-2 2026-03-13-w3 2026-03-08-1

New York Times 2026-03-17-3

Nvidia Built the A.I. Era. Now It Has to Defend It.

Nvidia is the first major chipmaker to unbundle training from inference at the architecture level, pairing its GPUs with Groq's inference-optimized LPUs in a $20B licensing deal. The supply chain math is as interesting as the product: Groq, fabbed at Samsung with no HBM dependency, sidesteps both TSMC allocation constraints and memory chip shortages. If inference grows to 70-80% of total AI compute spend, the companies building chip-agnostic inference routing will capture a new middleware layer that doesn't exist yet.

# tags

ai-economics inference custom-silicon supply-chain competitive-dynamics

◆ entities

Nvidia Groq Google Cerebras OpenAI Meta Samsung TSMC

→ threads

ai-economics multi-model-strategy

⟷ links

2026-03-10-1 2026-03-14-1 2026-03-16-3 2026-03-14-3 2026-03-16-2 2026-03-14-2 2026-03-15-1 2026-03-12-3 2026-03-13-w1 2026-03-10-2
