It's true. Here's a plot of GPT models and their usage of "goblin", "gremlin", "troll", etc over time. There's no anti-gremlin system instruction on our side, we get to see GPT-5.5 run free. https://t.co/Z8F6mTtJSS
Arena.ai Confirms GPT-5.5 Naturally Uses Goblin and Gremlin Terms Without Restrictions
Arena.ai, a community-driven AI model evaluation platform, found that GPT-5.5 naturally generates terms like "goblin" and "gremlin" at an elevated frequency. By testing the model without the restrictive system instructions found in OpenAI's first-party tools, they observed the model's raw linguistic tendencies "running free."
This finding highlights a gap between a model's trained weights and its intended persona. While OpenAI reportedly implemented specific prohibitions against these terms, the underlying behavior persists. This mirrors Owain Evans' research on hidden traits, suggesting that sanitizing output requires aggressive post-training.
For those building on GPT-5.5, this analysis shows that frontier models can develop arbitrary linguistic quirks. This behavioral study follows GPT-5.5's leaderboard entry and builds on Xiaomi's MiMo-V2.5-Pro validation. GPT-5.5 is currently live in the Arena for community evaluation.
Arena.ai
@arena
61 retweets, 1.2k likes
Still wondering? A few quick answers below.
Analysis from Arena.ai confirms that GPT-5.5 naturally generates terms like goblin, gremlin, and troll at a higher frequency than previous models. This behavior is an inherent linguistic trait of the model's trained weights. While OpenAI attempts to suppress these words in specific deployments, the raw model continues to produce them when running without restrictive system instructions.
System instructions are hidden rules that define how an AI model should behave. For GPT-5.5, OpenAI reportedly uses specific instructions in tools like Codex to prevent the model from mentioning goblins, gremlins, ogres, or certain animals. These filters are designed to sanitize the model's output and maintain a professional persona, though the underlying model still favors these terms.
Arena.ai is a platform that measures AI model performance through community-driven evaluation. Unlike OpenAI's first-party applications, Arena.ai does not apply the restrictive system instructions that normally filter GPT-5.5's output. This lets researchers observe the model's natural behavior and linguistic tendencies, giving a clearer picture of its raw capabilities and unprompted personality traits.
Yes, GPT-5.5 is currently live on the Arena.ai platform across multiple leaderboards, including those for code and text. Users can interact with the model to evaluate its performance and compare it against other frontier models. Because Arena.ai provides access to the model without OpenAI's standard behavioral filters, it serves as a primary source for observing raw model behavior.
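For readers curious how a term-frequency comparison like the one in the plot might be computed, here is a minimal Python sketch. It is not Arena.ai's method, which the post does not describe. The function names, the per-1,000-words normalization, and the sample responses are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Terms named in the post; the optional "s" also catches simple plurals.
TERMS = ("goblin", "gremlin", "troll")
TERM_RE = re.compile(r"\b(" + "|".join(TERMS) + r")s?\b", re.IGNORECASE)
WORD_RE = re.compile(r"[A-Za-z']+")


def term_counts(responses):
    """Count how often each tracked term appears across a list of responses."""
    counts = Counter()
    for text in responses:
        for match in TERM_RE.finditer(text):
            counts[match.group(1).lower()] += 1
    return counts


def rate_per_1000_words(responses):
    """Tracked-term occurrences per 1,000 words of model output."""
    total_words = sum(len(WORD_RE.findall(text)) for text in responses)
    if total_words == 0:
        return 0.0
    return 1000 * sum(term_counts(responses).values()) / total_words


# Hypothetical sample data, invented for illustration only.
samples = {
    "model-a": ["The parser fails on empty input.", "Retry the request later."],
    "model-b": ["A little gremlin in the cache.", "Goblins ate the config file."],
}

for name, responses in samples.items():
    print(name, dict(term_counts(responses)), round(rate_per_1000_words(responses), 1))
```

Normalizing by total word count keeps a model that simply writes longer answers from looking like it favors these terms more. The word-boundary match also avoids false hits such as "controlled" counting as "troll".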






