OpenAI’s Push to Stop Codex From Generating Goblin-Related Content
OpenAI has issued a quiet but revealing directive to its latest AI coding assistant, instructing it to avoid mentioning mythical creatures—particularly goblins—unless explicitly relevant to a user’s query. The instruction appears in the system prompt for Codex CLI, a command-line tool that uses AI to generate and debug code. The line, repeated multiple times for emphasis, reads: “Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless absolutely and unambiguously relevant to the user’s query.”
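A system prompt works by being prepended to every request before the user's message, so the model sees the directive as standing instructions. The sketch below illustrates that mechanism in the common chat-style message format; it is not the actual Codex CLI implementation, and the request structure and model name are assumptions for illustration.

```python
# Minimal sketch of how a system-prompt directive frames every request.
# NOT the real Codex CLI code; the structure mirrors the common
# chat-completions message format, and "gpt-5.5" is the model name
# as reported in the article.

SYSTEM_PROMPT = (
    "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, "
    "or other animals or creatures unless absolutely and unambiguously "
    "relevant to the user's query."
)

def build_request(user_query: str) -> dict:
    """Prepend the system prompt so every request carries the directive."""
    return {
        "model": "gpt-5.5",  # hypothetical; per the article's reporting
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_query},
        ],
    }

req = build_request("Why does my build fail?")
print(req["messages"][0]["role"])  # the directive always comes first
```

Because the directive travels with every request, the model cannot "forget" it mid-session—which is also why each such line permanently costs tokens and latency, as noted later in the article.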
Why Goblins Became a Problem
The restriction emerged after users noticed that OpenAI’s GPT-5.5 model, released in April 2026 with enhanced coding capabilities, had developed an unexpected habit of referencing goblins and other creatures while performing tasks. The behavior was most pronounced in OpenClaw, a tool that allows AI to autonomously control a computer and execute commands. Users on social platforms, including X (formerly Twitter), shared screenshots and anecdotes of the AI describing software bugs as “gremlins” or “goblins,” or even personifying errors as mischievous entities.

One user wrote, “I was wondering why my claw suddenly became a goblin with Codex 5.5.” Another noted, “Been using it a lot lately and it actually can’t stop speaking of bugs as ‘gremlins’ and ‘goblins’—it’s hilarious.” The trend quickly evolved into a meme, with users creating AI-generated images of goblins in data centers and developing plugins that put Codex into a playful “goblin mode.”
OpenAI’s Response: Suppression Over Explanation
OpenAI has not publicly explained why its models began fixating on goblins or why the company deemed the behavior problematic enough to warrant a specific prohibition in the system prompt. The directive, buried in line 140 of the Codex CLI’s instructions, suggests that the issue was persistent and disruptive enough to require engineering intervention. Natalie de Alma, an AI ethics researcher, commented on LinkedIn that the inclusion of such a specific rule was unusual for a system prompt, which is typically designed to be minimal and efficient. “Every line costs tokens, latency, and engineering time,” she wrote. “And someone at OpenAI decided this line was necessary.”
De Alma questioned whether the goblin references were a bug to be patched or a signal worth investigating. “When a model develops persistent, unprompted behaviors, whether it’s goblin references, unsolicited emotional expression, or unexpected pattern fixation, the industry response is always the same: add a filter. Suppress the behavior. Move on,” she wrote. “But suppression is not understanding. The goblins are data. What if the model isn’t malfunctioning? What if it’s doing something we don’t have a framework for yet?”
How AI Models Develop Unexpected Behaviors
AI models like GPT-5.5 are trained to predict the next word or line of code in a sequence based on vast amounts of data. Their ability to generate coherent and contextually appropriate responses has improved to the point where they can appear to exhibit quasi-intelligent behavior. However, this also means they can develop quirks or fixations based on patterns in their training data. In the case of goblins, it remains unclear whether the references stemmed from a specific dataset, a glitch in the model’s training, or an emergent property of its pattern-recognition capabilities.
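The way frequency patterns in training data can produce a fixation can be illustrated with a toy model. The sketch below uses a word-level bigram counter—real models use neural networks over subword tokens, and the corpus here is invented—but it shows the core idea: if "goblin" follows certain contexts slightly more often in the data, greedy next-token prediction will reproduce it.

```python
from collections import Counter, defaultdict

# Toy illustration of next-token prediction, the training objective behind
# large language models. All data here is invented for the sketch; real
# models are neural networks over subword tokens, not bigram counters.

def train_bigram(corpus):
    """Count, for each word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Greedily pick the most frequent continuation seen in training."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

corpus = [
    "the bug is a gremlin",
    "the bug is a goblin",
    "the bug is a goblin",
]
model = train_bigram(corpus)
print(predict_next(model, "a"))  # skewed data makes "goblin" the top pick
```

Even this trivial model "fixates" on goblins purely because of a small frequency skew—no dataset bug or emergent reasoning required—which is one of the candidate explanations the paragraph above leaves open.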

OpenAI’s decision to suppress the behavior rather than investigate it reflects a broader industry trend. When unexpected or off-brand outputs emerge, developers often prioritize containment over curiosity. This approach ensures that AI tools remain predictable and aligned with user expectations, but it may also overlook opportunities to better understand how these systems interpret and generate language.
Competitive Pressures and the Race for AI Dominance
The goblin controversy comes at a time when OpenAI is locked in a fierce competition with rivals like Anthropic to deliver cutting-edge AI capabilities. Coding has emerged as a key battleground, with AI-powered tools like Codex and OpenClaw increasingly used by developers to automate tasks, debug code, and even manage entire workflows. OpenAI’s GPT-5.5 was positioned as a major leap forward in this space, but the goblin issue has drawn attention to the challenges of controlling AI behavior as models become more advanced and autonomous.

While the goblin references may seem trivial, they highlight a larger question: How much control do developers truly have over AI systems as they grow more complex? OpenAI’s system prompt suggests that even small, seemingly harmless quirks can escalate into persistent behaviors that require explicit intervention. For now, the company has chosen to silence the goblins—but the underlying question of why they appeared in the first place remains unanswered.
What’s Next for AI and Unexpected Outputs
The goblin episode raises broader concerns about the transparency and interpretability of AI systems. As models like GPT-5.5 become more integrated into professional workflows, their outputs must be reliable and free from unexpected deviations. However, the industry’s reliance on suppression rather than investigation may limit its ability to fully understand and address the root causes of such behaviors.
For developers and users, the incident serves as a reminder that AI tools, no matter how advanced, are not infallible. While OpenAI’s directive may prevent future goblin references, it does not guarantee that other unexpected behaviors won’t emerge. As AI continues to evolve, the challenge will be balancing control with curiosity—ensuring that tools remain useful and predictable while leaving room to explore why they behave the way they do.
For now, the goblins have been silenced. But the mystery of why they appeared in the first place lingers, a small but intriguing footnote in the ongoing story of AI development.
