Nerdy personality

1 item

OpenAI 2026-05-01-2

Where the goblins came from

OpenAI's goblin postmortem buries the lede: reward signals applied to a single personality leaked into base behavior in 76.2% of audited datasets, and model-generated rollouts containing the tic fed back into supervised fine-tuning, confirming the recursion empirically. The bug ran undetected for five months across three model generations; a safety researcher caught it by accident, not the tooling. Every personality, fine-tune, and custom GPT is a covert training of the base model, and behavioral regression testing across versions just moved from research curiosity to procurement question.