SSRN — tisram

SSRN · 2026-03-26 2026-03-27-w2

Can LLMs Discover Novel Economic Theories?

A $25 pipeline generated 257 economic theories and independently converged on the same mechanism a human researcher published months later. That is less a curiosity than a stress test for every organization currently spending on AI-powered generation. When the cost of producing candidates collapses to noise, the constraint shifts entirely to knowing which candidates are good. That's the connection to tokenmaxxing: both stories are about the same missing layer, the scoring infrastructure that converts output volume into output value. The Karpathy Loop works precisely because it starts with a measurable metric and a stopping criterion; the constraint is the insight, not the generation. Organizations that build deterministic scoring architecture now, with LLM judgment in a minority role, will compound their lead; the ones optimizing for generation throughput are manufacturing commodities at scale.
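To make "a measurable metric and a stopping criterion" concrete, here is a minimal sketch of a generate-and-score loop. It is a generic illustration, not a reconstruction of the Karpathy Loop or of the paper's pipeline; `search`, `generate`, `score`, `target`, and `max_iters` are all hypothetical names chosen for this example.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def search(
    generate: Callable[[], T],   # hypothetical: produces one candidate per call
    score: Callable[[T], float], # hypothetical: the measurable metric
    target: float,               # stopping criterion: stop once the metric reaches this
    max_iters: int,              # budget cap so the loop always terminates
) -> Tuple[Optional[T], float, int]:
    """Return (best candidate, best score, iterations used)."""
    best: Optional[T] = None
    best_score = float("-inf")
    for step in range(1, max_iters + 1):
        candidate = generate()
        s = score(candidate)
        if s > best_score:
            best, best_score = candidate, s
        if best_score >= target:
            # The metric decides when to stop, not the volume generated.
            return best, best_score, step
    return best, best_score, max_iters
```

Generation is the cheap part of this loop. Everything that makes the result trustworthy sits in `score` and `target`, which is why a weak or missing metric leaves the loop with nothing to optimize.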

# tags

agentic-ai ai-economics ai-for-science evaluation

SSRN 2026-03-26-3

Can LLMs Discover Novel Economic Theories?

An automated pipeline generated 257 candidate economic theories for two open asset pricing puzzles at a total cost of $25, and the system independently converged on the same limited-participation mechanism a human researcher published months later. The real finding isn't that LLMs can theorize; it's that when generation costs collapse toward zero, the only defensible position is evaluation infrastructure. Every org pouring money into AI-powered generation should be spending 10x more on scoring architecture: deterministic anchors carrying majority weight, LLM judgment in the minority.
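One way to read "deterministic anchors carrying majority weight, LLM judgment in the minority" is as a weighted blend in which the deterministic share is fixed above one half. The sketch below assumes that reading. The 0.7 weight, the check names, and the `Candidate` type are illustrative assumptions, not details from the paper or its pipeline.

```python
from dataclasses import dataclass
from typing import Dict, List

# Illustrative split: any anchor weight above 0.5 keeps the LLM judge in the minority.
ANCHOR_WEIGHT = 0.7
LLM_WEIGHT = 1.0 - ANCHOR_WEIGHT


@dataclass
class Candidate:
    name: str
    anchor_checks: Dict[str, bool]  # hypothetical deterministic pass/fail checks
    llm_score: float                # judge score in [0, 1], assumed to come from a separate model call


def anchor_score(c: Candidate) -> float:
    """Fraction of deterministic checks passed (0.0 if there are none)."""
    if not c.anchor_checks:
        return 0.0
    return sum(c.anchor_checks.values()) / len(c.anchor_checks)


def combined_score(c: Candidate) -> float:
    """Blend the two signals, with deterministic checks carrying most of the weight."""
    if not 0.0 <= c.llm_score <= 1.0:
        raise ValueError("llm_score must be in [0, 1]")
    return ANCHOR_WEIGHT * anchor_score(c) + LLM_WEIGHT * c.llm_score


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Order candidates from highest to lowest combined score."""
    return sorted(candidates, key=combined_score, reverse=True)
```

With this split, a candidate that fails every deterministic check cannot score above 0.3 however much the judge likes it, while one that passes every check scores at least 0.7.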

# tags

ai-economics agentic-ai evaluation ai-for-science

◆ entities

gpt-oss-120b SSRN Li and Lin DeepInfra

→ threads

ai-economics agentic-ai-viability reliability

⟷ links

2026-03-24-1 2026-03-25-2 2026-03-21-1 2026-03-20-3 2026-03-08-1 2026-03-13-w3 2026-03-21-2 2026-03-10-2 2026-03-20-w2 2026-03-20-w1
