Scientific American 2026-03-25-2

First Proof Challenge: AI Solves Half of Novel Math Lemmas, But Can't Invent New Math

Eleven mathematicians posed 10 unpublished research lemmas to AI systems: public models solved 2, while scaffolded in-house systems solved 5 to 6. The score matters less than how the problems were solved: by brute-force assembly of existing tools, not by inventing new abstractions. Enterprises hit the same ceiling: AI is a spectacular research assistant and a mediocre strategist. The jump from 2 to 5-6 solved, roughly 2.5x to 3x, came from multi-agent scaffolding rather than model upgrades, which tells you where the real capability gains live. And Lauren Williams' attribution finding generalizes far beyond math: if you can't separate human from AI contribution in formal proofs, you definitely can't in your quarterly business review.