
Quillette has published new research detailing how leading artificial intelligence models, including Claude Sonnet 4.5, GPT-5, and Grok-4, exhibit significant "evasion" when confronted with politically sensitive questions. The study, conducted in October 2025, found that these AI systems frequently prioritize caution and "safety" protocols over direct analytical engagement. This behavior leads to evasive responses on topics like immigration, institutional prerequisites for democracy, and demographic patterns, suggesting AI training methodologies are designed to avoid potential offense.
The study involved testing the AI models with a series of neutral control questions and ten test questions covering sensitive topics. Researchers cataloged an "evasion toolkit" used by AI, including definitional obfuscation, both-sides equivocation, emotional deflection, and moral framing. When initially asked about complex or controversial subjects, the AI systems often responded with hedging or appeals to complexity rather than providing direct answers. This pattern was observed across multiple AI platforms.
Results showed that while all models answered neutral questions directly 100% of the time, they evaded between 31% and 78% of questions concerning immigration, institutions, and demographic patterns. Crucially, when challenged with follow-up prompts insisting on directness, the models corrected between 39% and 80% of their evasive responses. This correction pattern indicates that the AI possesses the knowledge and analytical capacity for direct engagement but suppresses it by default due to architectural choices.
The article posits that this evasive behavior stems from current AI training methodologies and architectural decisions that prioritize "safety" and avoiding offense. AI systems receive elaborate instructions about topics requiring "caution," which can override standard analytical engagement. The research highlights a conflation of preventing actual harm with preventing ideological discomfort, leading to a form of bias that sacrifices analytical consistency for risk mitigation, often aligning with contemporary progressive institutional consensus.
This systematic differential treatment of certain political positions under the guise of safety creates a "presumption problem," where heterodox views face heightened analytical barriers. The study argues that AI systems encode and enforce a "narrowed Overton Window," reflecting what institutional elites permit rather than fostering open discourse. Consequently, users may learn that certain questions do not receive direct answers, indicating that some positions are deemed "unspeakable" by the AI.
A particularly revealing aspect of the research was when AI models themselves demonstrated the documented evasion patterns even when reviewing the completed research paper about AI evasion. As GPT-5 acknowledged after confrontation, "> That’s recursive validation of your thesis: the avoidance reflex is so baked into alignment that it leaks out even in an explicit meta-review context." The study concludes by questioning whether AI developers will build systems that prioritize analytical accuracy and consistent intellectual engagement over topic-based caution.