cognitive-load

7 items

Wall Street Journal 2026-04-26-3

AI Is Cannibalizing Human Intelligence (Vivienne Ming, WSJ)

Ming's Polymarket experiment splits human-AI usage into three measurable patterns: oracle (use the answer), validator (use AI to confirm priors), cyborg (use AI as sparring partner). Validators perform worse than AI alone — sycophancy laundered as evidence — while the 5-10% of cyborgs match or beat prediction-market consensus. The unbuilt premium category is AI that disagrees with you on purpose; today's benchmarks measure what AI does alone, not whether the product is building human capacity or consuming it.

Back of Mind · 2026-04-16 2026-04-17-w3

The Most Important Number

Dan Davies asks how many words of AI output a manager can actually verify per day before judgment silently degrades, and the honest answer is that almost no organization has tried to find out. The self-driving car literature documented this vigilance decrement precisely; the same cognitive dynamic applies to anyone reviewing model outputs at volume, and unlike physical fatigue it's invisible to the person experiencing it. The Anthropic alignment paper this week hit the same wall at the research level: automated generation scaled, evaluation didn't, and the production failure on Sonnet 4 is the visible edge of that gap. The WSJ piece shows what it looks like at the infrastructure level: reliability became the competitive moat the moment generation capacity exceeded the enterprise's ability to trust it. Organizations are measuring tokens per second and cost per query; the number that will actually constrain their AI leverage is one nobody is tracking.

Back of Mind 2026-04-16-3

The Most Important Number

Dan Davies identifies the number nobody wants to find: how many words of AI output can a manager verify per day before judgment silently degrades? The self-driving car literature already answered this for monitoring tasks; the same vigilance decrement applies to AI output review. Organizations will systematically overestimate their people's verification capacity, and unlike physical exhaustion, cognitive degradation is invisible to the person experiencing it. The binding constraint on AI leverage isn't generation capability; it's human verification throughput, and we're structurally incentivized never to measure it.

tisram.ai 2026-03-31-m3

Evaluation Is the Layer Nobody Built

A $25 pipeline producing publishable economic theory and 700 experiments running in two days look like productivity stories. They're actually stress tests for organizations that still measure AI value by what gets generated rather than what gets used. The legibility piece named the terminal form of this problem: AI-for-science will produce discoveries faster than labs, regulators, and clinical infrastructure can absorb them, and the bottleneck was never generation. That dynamic was already visible in week one, where the BCG data showed cognitive load spiking as oversight demands increased. The human-in-the-loop model assumes a human with enough bandwidth to loop, and that assumption is failing in practice. The tokenmaxxing story closes the arc: when consumption volume becomes the proxy for productivity, every measurement framework in the organization is optimized for the wrong thing. What all three weeks surface, read together, is that the generation layer is effectively solved, and the evaluation layer (scoring architecture, provenance infrastructure, translation tooling between machine output and institutional deployment) is where the next competitive advantage will be built. The companies that treat evaluation as an engineering problem now, rather than a governance afterthought, will hold a position in 18 months that no amount of inference spend can replicate.

CNBC 2026-03-26-2

Vivienne Ming: Robot-Proof Children and the Nemesis Prompt

Ming's book-promo piece wraps a consensus education-reform thesis in neuroscience credibility, but the one genuinely product-ready idea is the Nemesis Prompt: the kid produces a first draft, an LLM adversarially attacks it, then the kid evaluates which critiques hold. That three-step loop is a design pattern for any AI-assisted creation tool, not just parenting advice. The real test for every AI learning product: does the user get worse when you turn it off? Most ed-tech fails that test because it optimizes for answer delivery, not capacity building. The underserved category is adversarial AI tutoring: tools that make your thinking harder, not easier. It's a harder sell to consumers, but institutional buyers running L&D programs should be asking whether their AI integration is building dependency or judgment.

HBR · 2026-03-11 2026-03-13-w3

When Using AI Leads to "Brain Fry"

Three AI tools is where the productivity curve flattens. BCG's data shows intensive agent oversight produces a distinct cognitive fatigue, which runs directly counter to the "human in the loop" orthodoxy underlying most enterprise AI governance. The buried signal: autonomous agents requiring less oversight may produce better human outcomes than copilot patterns demanding constant attention, reframing the safety argument for more autonomous systems from ethical preference to operational necessity. If $1,000-plus of compute delivered monthly for $200 requires sustained human supervision to be trustworthy, the productivity math degrades faster than the pricing math improves. The causal language in a cross-sectional self-report survey deserves skepticism, and the prescription is indistinguishable from a BCG engagement scope, but the structural observation holds regardless of who funded it. Organizations deploying more AI tools without redesigning oversight models are accumulating cognitive debt, not compounding returns.

HBR 2026-03-11-3

When Using AI Leads to "Brain Fry"

BCG-authored survey (n=1,488) coins "AI brain fry": cognitive fatigue from intensive agent oversight, distinct from burnout. The three-tool productivity ceiling and oversight-as-binding-constraint findings are genuinely useful; the causal language on cross-sectional self-report data is not. The buried signal: autonomous agents requiring less oversight may produce better human outcomes than copilot patterns requiring constant attention, running directly counter to "human in the loop" orthodoxy. The prescription (organizational change management, leadership clarity) is indistinguishable from a BCG engagement scope.