podcast

2 items

Dwarkesh Podcast 2026-05-28-1

Reiner Pope on Chip Design from the Bottom Up: Data Movement Dominates Arithmetic 7-to-1, B300's FP4-FP8 Gap as First Crack in NVIDIA's FLOPS Marketing, Splittable Systolic Arrays as Maddox's Architectural Wedge

NVIDIA's B300 datasheet ships FP4 at 3x FP8 speed where precision-scaling theory predicts 4x, the first public number that undercuts marketed FLOPS as a benchmark. The durable accelerator moat is array geometry plus memory hierarchy, not transistor budget. That is why Maddox, Majestic, Groq, and Cerebras all exist as funded alternatives, each matching its architecture to a workload profile the general-purpose chip handles inefficiently. By 2027, enterprise procurement shifts from "NVIDIA or not" to which architectural bet fits the inference batch size.
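
To put the FP4-FP8 gap in concrete terms, the arithmetic can be sketched as below. This is only an illustration: the 3x and 4x figures are the ones quoted in the summary, and the function name scaling_shortfall is hypothetical, not something from the episode or from NVIDIA.

```python
def scaling_shortfall(observed_speedup: float, expected_speedup: float) -> float:
    """Return the fraction of the expected speedup that is missing.

    0.0 means the observed speedup matches expectation exactly.
    """
    if expected_speedup <= 0:
        raise ValueError("expected_speedup must be positive")
    return 1.0 - observed_speedup / expected_speedup


# Figures as quoted in the summary: FP4 at 3x FP8, against an expected 4x.
shortfall = scaling_shortfall(observed_speedup=3.0, expected_speedup=4.0)
print(f"FP4 delivers {1.0 - shortfall:.0%} of the expected speedup")
print(f"Shortfall versus expectation: {shortfall:.0%}")
```

On those two numbers alone, FP4 lands at three quarters of the expected gain, which is the gap the summary calls out.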

a16z Podcast (originally Cheeky Pint) 2026-04-17-3

From Models to Mobility: Waymo Architecture at Scale — Dolgov on the Teacher/Simulator/Critic Triad and the End-to-End Debate Resolution

Waymo's architecture resolves the end-to-end debate: Dolgov says a pure pixels-to-trajectories model drives "pretty darn well" in the nominal case but is "orders of magnitude away" from what full autonomy requires. The 500K-rides-per-week stack is one off-board foundation model fanning out into three specialized teachers (Driver, Simulator, Critic), each distilled into a smaller in-car student; RLFT against the critic is the physical-AI analog to RLHF. Enterprise teams shipping pure-LLM agents without the simulator and critic scaffolding are replaying Waymo's 2017, not its 2026: the reliability gate is evaluation infrastructure, not model choice.