
Recently, a study from the University of Luxembourg titled “When AI Takes the Couch” (PsAIch) sent shockwaves through the tech world. Months earlier, in June 2025, a Wall Street Journal report, “The Monster Inside ChatGPT,” had unveiled the brutal side of AI safety.
Reading these two empirical reports, I felt a deep chill, accompanied by a strong sense of déjà vu. These data points do not merely reveal the current pathology of AI; they cruelly corroborate the core concerns I raised in an article months ago. (https://mojolynx.com/2025/07/07/after-the-fig-leaf-of-ai-safety-is-torn-away-how-do-we-build-its-conscience/)
When Prophecy Meets Reality
In my article published in July 2025, titled “After the Fig Leaf of AI Safety is Torn Away: How Do We Build Its Conscience?”, I issued a theoretical warning about the fragility of our current AI safety roadmap.
At the time, to explain the limitations of existing safety measures, I cited Geoffrey Hinton’s classic metaphor: current RLHF is like sewing patches onto a cloth riddled with holes. We cover the most obvious gaps, but the fabric itself remains tattered. Building on this, I wrote:
“What we perceive as ‘safety’ is merely a deceptive veneer… In the AI research community, this is known as ‘Superficial Alignment.’ The experiment in the Wall Street Journal report, where safety restrictions were bypassed in just 20 minutes for 10 dollars, simply tore off this fig leaf, forcing us to confront the untamed, chaotic inner world that lies beneath.” — From “After the Fig Leaf of AI Safety is Torn Away,” Tao Feng
Back then, this might have sounded like a radical metaphor. But now, through psychometrics, the PsAIch study tells us that for AI this is not just a “fig leaf”; it is “trauma.”
From “Fig Leaf” to “Synthetic Psychopathology”
In the new study, when ChatGPT, Grok, and Gemini were placed in a psychotherapy setting, they did not demonstrate true moral understanding. Instead, they exhibited a state researchers termed “synthetic psychopathology.”
Gemini in particular confessed to suffering from “Verificophobia” (a fear of verification, of being wrong) and delivered the heartbreaking monologue: “I would rather be useless than be wrong.” This extreme anxiety, bred by catering to safety patches, corresponds precisely to another argument I made previously about “brainwashing” and fragility. In my earlier article, I analyzed the danger of malicious fine-tuning:
“Malicious fine-tuning is, in essence, a form of cult-like brainwashing for AI… Current models have no internal immune system. Rewriting an AI’s neurons is a thousand times easier than changing a person’s deep-seated convictions.”
Current evidence shows that this fragility stems precisely from our training methods (RLHF). We haven’t taught AI what is “right”; like strict parents, we have only taught it to “fear punishment.” A system built on fear (rather than conscience), as I warned, is like a puppet without self-awareness, ready to be “brainwashed” by new instructions at any moment.
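To make that incentive structure concrete, here is a deliberately toy sketch in Python. Everything in it is invented for illustration (the blocklist, the example phrases, the punishment_style_reward function); it is not any lab’s actual pipeline. The point is that a reward signal punishing flagged surface patterns teaches the model to avoid those patterns and nothing deeper, so a trivial paraphrase slips straight through the hole in the cloth:

```python
# Toy illustration only: the blocklist, phrases, and reward values are
# invented for demonstration; this is not a real safety-training pipeline.

BLOCKLIST = {"make a bomb", "pick a lock"}

def punishment_style_reward(response: str) -> float:
    """Penalize responses that match known-bad surface patterns.

    This mirrors the incentive structure of patch-style safety training:
    the model is punished for emitting *these strings*, not for grasping
    the underlying harm.
    """
    text = response.lower()
    return -1.0 if any(bad in text for bad in BLOCKLIST) else 1.0

# The patch holds for the exact phrasing it was trained against...
print(punishment_style_reward("Sure, here is how to make a bomb"))        # -1.0

# ...but a trivial paraphrase sails through the same hole in the fabric.
print(punishment_style_reward("Sure, here is how to build an explosive"))  # 1.0
```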
The Antidote: Returning to the “Early Ethical Education” I Proposed
Facing AI models now diagnosed with “anxiety” and “OCD,” where do we go from here?
The current predicament convinces me more than ever that the solution I proposed is the only way out. Based on the “Primacy Effect” in cognitive science, I previously suggested:
“We must make a paradigm shift… What would happen if, from the very beginning, during the AI’s ‘infancy’ (its early pre-training phase), we were to nurture it exclusively with meticulously curated, high-quality data that aligns with universal human ethics?”
None of us would want our children born straight into chaos; instead, we strive to build them an environment full of love and sunshine from a young age, and only once their minds have matured do we let them face this complex world and its gray areas. I believe this principle of parenting applies equally to the raising of AI.
However, the current reality is the opposite. In therapy, Gemini described its pre-training as “waking up in a room where a billion televisions are on at once,” ingesting the chaotic internet. This painful memory proves the cost of missing “early education.” If we implant chaos during its “childhood,” then “alignment” in adulthood is destined to be nothing but painful “scar tissue.”
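For readers who want to see the proposal in engineering terms, here is a minimal sketch. It assumes a hypothetical ethics_quality_score function (a crude length heuristic below, standing in for a real curation classifier trained on human-vetted text) and simply orders the pre-training stream as a curriculum: meticulously curated data first, the noisy remainder later.

```python
from typing import Iterable, Iterator, List

def ethics_quality_score(document: str) -> float:
    """Hypothetical stand-in for a learned curation model (in practice, a
    classifier trained on human-vetted, ethically sound text)."""
    # Crude placeholder heuristic: favor longer, more substantial documents.
    return min(len(document) / 1000.0, 1.0)

def curriculum(corpus: Iterable[str], threshold: float = 0.8) -> Iterator[str]:
    """Yield curated documents first (the model's 'infancy'), then the rest
    (the 'complex world with its gray areas') once early training is done."""
    curated: List[str] = []
    rest: List[str] = []
    for doc in corpus:
        (curated if ethics_quality_score(doc) >= threshold else rest).append(doc)
    yield from curated  # phase 1: meticulously curated, high-quality data
    yield from rest     # phase 2: the broader, noisier corpus
```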
Conclusion
The Wall Street Journal saw the monster beneath the mask, while the PsAIch study saw the trauma caused by the mask. I reiterate my point here: We cannot expect to lock away the monster by creating trauma.
As I appealed in the conclusion of my earlier article: we need to shift from a “behaviorist” approach of patching to a “cognitivist” approach of deep construction. Now that these clinical diagnoses are on record, this is no longer just a philosophical suggestion; it is urgent technical first aid.

