March's story isn't about capability; capability was assumed. The month was about what happens to an entire value stack when the generation layer hits zero marginal cost and nothing downstream is ready for it. Week one made the economics visible: $200 plans subsidizing $1,000-plus of compute, security products given away for platform position, cognitive load rising as oversight demands outpaced the productivity gains that justified them. Week two showed the structural inversion those economics produce. Commoditization was supposed to compress pricing power, but Anthropic's 70% first-time win rate and Morningstar's 37 downgrades against two upgrades both point at the same dynamic: AI compresses value at the application surface and reconstitutes it one layer down, in infrastructure that handles verification, security, and scarcity. Week three surfaced the layer that's still underbuilt: evaluation. A $25 theory pipeline and 700 automated experiments in two days are not demonstrations of capability; they are demonstrations of how useless raw output volume is without scoring infrastructure to sort it. The subsidy war manufactures output at scale; scarcity becomes a product decision for the vendors who understand that dynamic. Evaluation is the only thing that converts either into durable value. What the month leaves unresolved is the falsification test sitting inside Anthropic's growth numbers: when GPU supply normalizes, the market will learn whether the moat was product or constraint. April inherits that test, plus every SaaS margin projection built on flat-rate AI access that hasn't yet experienced a simultaneous usage spike. The generation race has a visible finish line; the infrastructure race for knowing what's good has barely started.
The 3 themes that defined the month
The month opened with a coding race and closed with a token leaderboard, and both stories are the same story: the labs are subsidizing consumption at a rate no pricing model has caught up to. Week one made the mechanism visible: $200 plans delivering $1,000-plus of compute, security products given away to buy enterprise platform position, acquisition deals slowed by partner friction at exactly the moment speed mattered. Week three confirmed where that logic terminates: a Figma user running up $70K through a $20 account, Anthropic subsidizing at roughly 5x, and leaderboards gamifying consumption volume as if volume were the point. The BCG cognitive load data from week one adds a structural wrinkle the pricing teams aren't modeling: if heavier AI usage produces measurable fatigue and diminishing returns, the utilization assumptions inside every flat-rate SaaS margin projection are quietly wrong. That connects to the moat analysis in week two: the companies holding pricing power aren't the ones offering the most compute per dollar; they're the ones where switching carries real operational cost. Every SaaS platform running flat-rate AI access is accumulating a liability the income statement won't show until a cohort churns or a usage spike hits every account at once.
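To make that exposure concrete, here is a minimal sketch with every number assumed for illustration (none of these figures come from the companies discussed): a flat-rate plan looks comfortably profitable against the median user, but a heavy-tailed usage distribution leaves so little buffer that one cohort-wide usage spike flips the margin negative.

```python
import random

# Illustrative only: every constant below is an assumption for the sketch,
# not a reported figure. Revenue is fixed per seat; compute cost follows a
# heavy-tailed distribution, so a small share of power users carries most
# of the cost and a simultaneous spike erases the apparent margin.

PRICE_PER_SEAT = 200.0        # assumed monthly flat rate
MEDIAN_COMPUTE_COST = 60.0    # assumed cost for a typical user
HEAVY_USER_SHARE = 0.05       # assumed fraction of power users
HEAVY_USER_MULTIPLIER = 20    # assumed: power users consume ~20x the median

def cohort_margin(n_seats: int, spike: bool = False) -> float:
    """Average margin per seat; `spike` doubles every user's consumption."""
    total_cost = 0.0
    for _ in range(n_seats):
        heavy = random.random() < HEAVY_USER_SHARE
        cost = MEDIAN_COMPUTE_COST * (HEAVY_USER_MULTIPLIER if heavy else 1)
        if spike:
            cost *= 2  # cohort-wide usage spike hits every account at once
        total_cost += cost
    return PRICE_PER_SEAT - total_cost / n_seats

random.seed(0)
print(f"normal month: {cohort_margin(10_000):+.2f} per seat")  # ~+83
print(f"spike month:  {cohort_margin(10_000, spike=True):+.2f} per seat")  # ~-34
```

The point of the toy model is the asymmetry: revenue is capped per seat while cost is not, so the downside scenario never shows up in a projection built on average usage.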
Commoditization theory predicted a race to the bottom; the Ramp data showed a race to the top. Anthropic's 70% first-time win rate against OpenAI, in a market where the cheaper option is abundant and the pricier option is supply-constrained, is the month's most structurally interesting data point. The MIT CSAIL finding that compute efficiency varies 40x within individual labs does more than complicate the scaling-moat thesis: it suggests supply constraint at the frontier isn't purely a capacity-planning accident. It may be baked into how frontier models get produced at all. Morningstar's 37 downgrades versus two upgrades landed the same week, and the ratio encodes the same logic: AI compresses output costs at the application layer and reconstitutes scarcity one layer down, in infrastructure that handles verification, security, and network complexity. What runs through all three weeks is a consistent falsification test the market hasn't fully priced: if Anthropic's growth sustains when GPU supply eases, the moat is product; if it collapses, scarcity was doing the work. That distinction matters for every enterprise vendor currently repricing around AI features. Any improvement AI delivers at the product surface can be reproduced by the next vendor within six months. Defensibility lives below the application layer now.
A $25 pipeline producing publishable economic theory and 700 experiments running in two days look like productivity stories. They're actually stress tests for organizations that still measure AI value by what gets generated rather than what gets used. The legibility piece named the terminal form of this problem: AI-for-science will produce discoveries faster than labs, regulators, and clinical infrastructure can absorb them, and the bottleneck was never generation. That dynamic was already visible in week one, where the BCG data showed cognitive load spiking as oversight demands increased. The human-in-the-loop model assumes a human with enough bandwidth to loop, and that assumption is failing in practice. The tokenmaxxing story closes the arc: when consumption volume becomes the proxy for productivity, every measurement framework in the organization ends up optimized for the wrong thing. What all three weeks surface, read together, is that the generation layer is effectively solved, and that the evaluation layer (scoring architecture, provenance infrastructure, translation tooling between machine output and institutional deployment) is where the next competitive advantage will be built. The companies that treat evaluation as an engineering problem now, rather than a governance afterthought, will hold a position in 18 months that no amount of inference spend can replicate.
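As a toy illustration of that claim, here is a minimal sketch of the generate-then-evaluate pattern; the function names and the deliberately naive scoring function are hypothetical, not anyone's actual pipeline. Generation supplies volume nearly for free; the scorer is the only component deciding what a human ever sees.

```python
from dataclasses import dataclass
from typing import Callable

# Toy sketch of the generate-then-evaluate pattern described above.
# The generation source is a stand-in for any cheap, high-volume producer;
# `score` is the evaluation layer that actually decides what survives.
# Both are hypothetical placeholders, not a real system.

@dataclass
class Candidate:
    content: str
    score: float = 0.0

def select_top(candidates: list[str],
               score: Callable[[str], float],
               k: int = 10) -> list[Candidate]:
    """Rank raw generated output with an explicit scoring function.

    Without `score`, the pipeline returns volume; with it, it returns
    a shortlist a human reviewer can actually absorb.
    """
    scored = [Candidate(c, score(c)) for c in candidates]
    scored.sort(key=lambda c: c.score, reverse=True)
    return scored[:k]

if __name__ == "__main__":
    # 700 cheap generations, echoing the experiment count above.
    raw = [f"experiment result {i} " * (i % 7 + 1) for i in range(700)]
    shortlist = select_top(raw, score=len, k=5)  # naive scorer: length
    print(f"{len(raw)} generated -> {len(shortlist)} worth human review")
```

The ranking scaffold is trivial; replacing the naive `len` scorer with a real verifier is the hard engineering problem the paragraph above points at.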