verifier-infrastructure
5 items · chronological order
The Man Behind AlphaGo Thinks AI Is Taking the Wrong Path
David Silver left DeepMind to raise $1.1B at $5.1B for Ineffable Intelligence on a thesis that says LLMs hit a ceiling defined by the human-data manifold and only RL-trained agents in simulations can break through. The architectural argument has teeth: AlphaGo's Move 37 came from outside human play, and Sutton just won the Turing Award for the foundational work. The unspoken bottleneck if Silver is right isn't compute or data, it's verifiers — reliable scoring functions for unbounded domains like science, governance, novel discovery — and that is the quiet investable category nobody's pricing yet.
Andrej Karpathy: From Vibe Coding to Agentic Engineering
Karpathy's December 2025 trust threshold is a behavioral signal more telling than any benchmark: senior practitioners stopped correcting agent outputs. The sharper insight sits in the MenuGen demo, where one Gemini Nano Banana call replaced an entire Vercel app stack; that collapse turns 'should this app exist at all' into the new build-evaluation primitive for 2026. Verifiability is where iteration compounds, which makes the verification environment, not the model or the prompt, the durable position in agentic AI.
Andrej Karpathy: From Vibe Coding to Agentic Engineering
Karpathy's trust threshold is the most telling data point in the piece: senior practitioners stopped correcting agent outputs in December 2025, not because agents became perfect, but because the correction cost exceeded the perceived value of intervening. The MenuGen demo makes the structural consequence concrete: one Gemini Nano Banana call replaced an entire Vercel app stack, which reframes the build decision from 'how should we architect this' to 'should this app exist at all.' That reframing connects to both other picks this week. Silver is betting that the next capability jump requires simulation environments and reliable scoring; the goblin postmortem confirms that without those, systems optimize for the wrong thing silently and at scale. The durable position in agentic AI isn't the model or the prompt or even the agent: it's the verification environment, the infrastructure that makes iteration trustworthy enough to trust.
The Man Behind AlphaGo Thinks AI Is Taking the Wrong Path
David Silver raised $1.1B at a $5.1B valuation on the argument that LLMs are bounded by the human-data manifold, and that the only way out is RL-trained agents operating in simulation. The architectural evidence is real: AlphaGo's Move 37 came from outside the space of human play, and Sutton's Turing Award validates the theoretical foundation Silver is building on. What this week's picks clarify is that the capability argument is almost beside the point: the OpenAI goblin postmortem shows that even current systems can't reliably control what they're optimizing for, and Karpathy's MenuGen demo shows that the harness around the model is already more consequential than the model itself. Silver's unpriced bottleneck, reliable verifiers for unbounded domains, is also the missing piece in both of those stories. The next value pool isn't in bigger models or better prompts; it's in the infrastructure that tells you whether the output was actually right.
How much of the scientific literature is generated by AI?
Three independent studies converge on the same finding: 30% of peer reviews at Organization Science, 1 in 8 top-tier biomedical papers, and 43% of arXiv CS review preprints now contain AI-generated text. The verifier and the verified are using the same tool. This is the fourth domain in 30 days where verification has emerged as the binding constraint on AI-era knowledge work, after enterprise dev, frontier math, and frontier physics. The investable thesis is no longer single-domain. The next moat in scientific publishing is detection-vendor integration; pre-2026 literature becomes a scarcity asset; mid-tier journals collapse.