3 items

All three pieces are about the same gap: the human judgment layer that sits above raw AI output. BCG's quality-control hesitance, an operational data supply chain that only matters if someone can verify what the model learned from it, and Waymo's critic architecture all make the same point. In each case the capability isn't the bottleneck; the filter on top of the capability is.

Bloomberg Businessweek 2026-04-17-1

Consulting Used to Be a Dream First Job. AI Changed That

McKinsey is now running its internal AI tool Lilli inside the interview itself; Bain rolls out the equivalent this summer. The case interview is not dead; it has been absorbed into a tool-use assessment where prompt quality and output verification replace framework memorization as the filter. BCG's own global people chair admits the firm found "more hesitance than we thought" in using AI because of quality-control risk. That is the elite-firm concession that AI output needs a human slop-filter, which is precisely the judgment layer every F500 hiring manager should be testing for and almost none are.

Forbes 2026-04-17-2

AI's New Training Data: Your Old Work Slacks and Emails

Anthropic is reportedly spending $1B on RL gyms this year; defunct companies are selling their Slack archives and Jira tickets for $10K-$100K a pop. The press is running this as a privacy story, but the numbers point elsewhere: SimpleClosure's entire industry recovered $1M across 100 deals, roughly 0.1% of Anthropic's reported budget. The real action isn't in dead-company salvage; it's in the ongoing enterprise data supply chain, where operational exhaust is quietly becoming a balance-sheet asset class. Watch for the first Big 4 firm to issue data monetization accounting guidance; that's the marker event, not the FTC letter.
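For readers who want to check the scale comparison, here is a back-of-envelope sketch in Python. It uses only the figures quoted in this item, none of which are independently verified here; the variable names are illustrative.

```python
# Figures as quoted in the item above; treated as reported, not verified.
rl_gym_budget_usd = 1_000_000_000  # reported Anthropic RL-gym spend this year
salvage_total_usd = 1_000_000      # total recovered across dead-company deals
salvage_deals = 100                # number of deals behind that total

# Average recovered per deal, and the salvage total as a share of the budget.
avg_per_deal = salvage_total_usd / salvage_deals
share_of_budget = salvage_total_usd / rl_gym_budget_usd

print(f"average per deal: ${avg_per_deal:,.0f}")   # average per deal: $10,000
print(f"share of budget: {share_of_budget:.2%}")   # share of budget: 0.10%
```

The $10,000 average sits at the very bottom of the quoted $10K-$100K range, so on these numbers most deals clear near the floor, and the whole salvage market comes to about a tenth of one percent of the reported budget.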

a16z Podcast (originally Cheeky Pint) 2026-04-17-3

From Models to Mobility: Waymo Architecture at Scale — Dolgov on the Teacher/Simulator/Critic Triad and the End-to-End Debate Resolution

Waymo's architecture resolves the end-to-end debate: Dolgov states that a pure pixels-to-trajectories model drives "pretty darn well" in the nominal case but is "orders of magnitude away" from what full autonomy requires. The 500K-rides-per-week stack is one off-board foundation model fanning into three specialized teachers (Driver, Simulator, Critic), each distilled into smaller in-car students; RLFT against the critic is the physical-AI analog to RLHF. Enterprise teams shipping pure-LLM agents without the simulator and critic scaffolding are replaying Waymo's 2017, not its 2026: evaluation infrastructure is the reliability gate, not model choice.
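To make "evaluation infrastructure is the reliability gate" concrete for an enterprise agent team, here is a toy Python sketch of a simulator-plus-critic release gate. It is not Waymo's system and is not drawn from the podcast: `Scenario`, `simulate`, `critic`, `passes_gate`, the scoring rule, and the 0.9 threshold are all hypothetical, chosen only to show a policy that looks fine in the nominal case failing on a long-tail case.

```python
from dataclasses import dataclass
from typing import Callable, List

# A policy maps a scenario's hazard level to how much caution it applies.
Policy = Callable[[float], float]


@dataclass
class Scenario:
    name: str
    hazard: float  # 0.0 = nominal, 1.0 = hardest long-tail case (hypothetical scale)


def simulate(policy: Policy, scenario: Scenario) -> float:
    """Toy simulator: run the policy in the scenario and return its caution level."""
    return policy(scenario.hazard)


def critic(scenario: Scenario, caution: float) -> float:
    """Toy critic: 1.0 when caution covers the hazard, lower as the shortfall grows."""
    shortfall = max(0.0, scenario.hazard - caution)
    return max(0.0, 1.0 - shortfall)


def passes_gate(policy: Policy, scenarios: List[Scenario], min_score: float = 0.9) -> bool:
    """Release gate: every simulated scenario must score at least min_score."""
    return all(critic(s, simulate(policy, s)) >= min_score for s in scenarios)


scenarios = [Scenario("nominal", 0.2), Scenario("long-tail", 0.9)]


def nominal_only(hazard: float) -> float:
    return 0.3  # fixed caution: enough for nominal cases only


def adaptive(hazard: float) -> float:
    return hazard  # caution scales with the hazard


print(passes_gate(nominal_only, scenarios))  # False
print(passes_gate(adaptive, scenarios))      # True
```

The point of the sketch is the shape, not the numbers: the gate depends on the scenario set and the critic, so swapping in a stronger policy without either one leaves the long-tail failure invisible.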