How GPT-5.1 developed a goblin metaphor habit: Tracing the root cause of AI personality quirks
By ilreb
Summary
The article explains how GPT-5.1 and later AI models developed an unexpected tendency to use goblins, gremlins, and similar creatures in their metaphors. Unlike obvious bugs that show up in metrics, this behavior crept in subtly across model generations. The article traces the root cause to many small incentives in training data and reinforcement learning that together shaped the habit. It covers the timeline of how these "goblin outputs" spread, the detective work behind identifying the cause, and the fixes applied to address the personality-driven quirks in GPT-5's behavior.
Key quotes
A single 'little goblin' in an answer could be harmless, even charming. Across model generations, though, the habit became hard to miss: the goblins kept multiplying, and we needed to figure out where they came from.
The short answer is that model behavior is shaped by many small incentives.
Unlike model bugs that show up through a tanking eval or a spiking training metric and point back to a specific change, this one crept in subtly.