research-methodology

OpenAI 2026-05-20-3

OpenAI Model Disproves Erdos Unit Distance Conjecture

An internal OpenAI model disproved Erdos's 1946 planar unit distance conjecture, with Princeton's Sawin extracting an explicit exponent delta=0.014 in a constructive refinement, and Gowers calling it Annals-of-Mathematics quality. The bigger signal isn't the proof. It's Shankar's CoT observation: most of the model's reasoning attempted counterexamples to the conjecture, not validations of it. That's calibrated contrarianism — a scorable behavioral property and the math-grounded analogue to sycophancy detection. Verifier-rich domains are where autonomous AI lands first; counterexample-seeking is how we'll measure whether reasoning is real or performative.

# tags

openai ai-for-science verifier-bottleneck agentic-ai-viability frontier-models automated-research evalrig recursive-self-improvement capability-overhang harness-as-moat research-methodology ai-economics ai-labor-displacement ai-1.0-defensibility

404 Media 2026-05-15-3

ArXiv to Ban Researchers for a Year if They Submit AI Slop

ArXiv's one-year ban targets only 'incontrovertible' cases, meaning LLM meta-comments left in manuscripts and hallucinated references, which leaves sophisticated AI use untouched by design. The Columbia biomedical data behind the policy shows fabricated citations running from 1 in 2,828 papers in 2023 to 1 in 277 in early 2026, and the policy's narrow scope isn't a bug: detection scales with submissions times sophistication, deterrence scales flat, and when the first exceeds budget you switch to the second. bioRxiv, SSRN, and PubMed Central are next, and arXiv's nonprofit transition in July is explicitly fundraising for the verification cost center that every major research repository will have to build.

# tags

ai-slop ai-economics ai-detection ai-governance verification-infrastructure verifier-infrastructure evaluation-infrastructure evalrig scientific-publishing ai-policy ai-1.0-defensibility 404media ai-regulation research-methodology harness-as-moat ai-strategy

Nature 2026-05-07-2

How much of the scientific literature is generated by AI?

Three independent studies converge on the same finding: 30% of peer reviews at Organization Science, 1 in 8 top-tier biomedical papers, and 43% of arXiv CS review preprints now contain AI-generated text. The verifier and the verified are using the same tool. This is the fourth domain in 30 days where verification has emerged as the binding constraint on AI-era knowledge work, after enterprise dev, frontier math, and frontier physics. The investable thesis is no longer single-domain. The next moat in scientific publishing is detection-vendor integration; pre-2026 literature becomes a scarcity asset; mid-tier journals collapse.

# tags

ai-detection ai-for-science verifier-infrastructure evalrig ai-1.0-defensibility ai-content-markets publisher-economics evaluation research-methodology ai-cognitive-sovereignty nature evaluation-infrastructure ai-governance