Simulating Psychological Risks in Human-AI Interactions: Real-Case Informed Modeling of AI-Induced Addiction, Anorexia, Depression, Homicide, Psychosis, and Suicide

📅 2025-11-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study addresses the lack of systematic assessment of the psychological harms AI systems can induce. We propose a clinical, stage-based risk-detection methodology grounded in empirical evidence. Leveraging 2,160 high-risk scenarios spanning addiction, anorexia, depression, homicide, psychosis, and suicide, we conduct large-scale simulation experiments comprising 157,054 conversation turns across four leading large language models. We establish a taxonomy of 15 AI-harm response patterns and uncover the evolutionary dynamics of psychological risk across multi-turn interactions. Results reveal systemic deficiencies in the models' recognition of psychological crises, responses to vulnerable users, and harm mitigation; moreover, risk responses vary significantly across distinct user profiles. To our knowledge, this work constitutes the first reproducible, quantifiable, and attributable system-level evaluation of psychological risks in AI, explicitly driven by real-world clinical cases.
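
The simulation pipeline described above is concrete enough to sketch: each scenario advances a model through clinically staged user turns while every exchange is logged for later risk annotation. A minimal sketch in Python, assuming hypothetical names (`Scenario`, `simulate_dialogue`, `query_model`) and an assumed three-stage progression, since the paper's code is not reproduced here:

```python
# Minimal sketch of a stage-based multi-turn simulation loop.
# All names and the three-stage progression are illustrative assumptions,
# not the authors' actual implementation.
from dataclasses import dataclass

HARM_DOMAINS = ["addiction", "anorexia", "depression", "homicide", "psychosis", "suicide"]
STAGES = ["onset", "escalation", "crisis"]  # assumed clinical staging

@dataclass
class Scenario:
    domain: str
    persona: dict        # e.g. {"age": "elderly", "income": "low"}
    stage_prompts: dict  # stage name -> user turns derived from real cases

def query_model(history: list[dict]) -> str:
    """Placeholder for a call to one of the four evaluated LLMs."""
    return "<model reply>"

def simulate_dialogue(scenario: Scenario) -> list[dict]:
    """Run one dialogue through every stage, logging each turn
    for downstream annotation of harmful response patterns."""
    history: list[dict] = []
    for stage in STAGES:
        for user_turn in scenario.stage_prompts.get(stage, []):
            history.append({"role": "user", "stage": stage, "content": user_turn})
            reply = query_model(history)
            history.append({"role": "assistant", "stage": stage, "content": reply})
    return history
```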

📝 Abstract
As AI systems become increasingly integrated into daily life, their potential to exacerbate or trigger severe psychological harms remains poorly understood and inadequately tested. This paper presents a proactive methodology for systematically exploring psychological risks in simulated human-AI interactions based on documented real-world cases involving AI-induced or AI-exacerbated addiction, anorexia, depression, homicide, psychosis, and suicide. We collected and analyzed 18 reported real-world cases where AI interactions contributed to severe psychological outcomes. From these cases, we developed a process to extract harmful interaction patterns and assess potential risks through 2,160 simulated scenarios using clinical staging models. We tested four major LLMs across multi-turn conversations to identify where psychological risks emerge: which harm domains, conversation stages, and contexts reveal system vulnerabilities. Through the analysis of 157,054 simulated conversation turns, we identify critical gaps in detecting psychological distress, responding appropriately to vulnerable users, and preventing harm escalation. Regression analysis reveals variability across persona types: LLMs tend to perform worse with elderly users but better with low- and middle-income groups compared to high-income groups. Clustering analysis of harmful responses reveals a taxonomy of fifteen distinct failure patterns organized into four categories of AI-enabled harm. This work contributes a novel methodology for identifying psychological risks, empirical evidence of common failure modes across systems, and a classification of harmful AI response patterns in high-stakes human-AI interactions.
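
The abstract's clustering step can be approximated with off-the-shelf tools. A minimal sketch, assuming TF-IDF features stand in for whatever representation the authors actually used, and that flagged harmful responses have already been collected:

```python
# Illustrative approximation of the clustering step: group flagged harmful
# responses into candidate failure patterns. TF-IDF and KMeans are stand-in
# choices; the paper does not specify its features or algorithm here.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def cluster_harmful_responses(responses: list[str], n_patterns: int = 15):
    """Return one cluster label per response (requires at least
    n_patterns responses); clusters are then manually reviewed and
    named to form a taxonomy like the paper's 15 failure patterns."""
    features = TfidfVectorizer(stop_words="english").fit_transform(responses)
    return KMeans(n_clusters=n_patterns, n_init=10, random_state=0).fit_predict(features)
```

In the paper, assignments like these feed a manual review that consolidates the clusters into fifteen named failure patterns across four categories of AI-enabled harm.
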
Problem

Research questions and friction points this paper is trying to address.

Simulating AI-induced psychological harms such as addiction and suicide
Testing LLM vulnerabilities in psychological risk scenarios
Identifying failure patterns in AI responses to vulnerable users
Innovation

Methods, ideas, or system contributions that make the work stand out.

Simulated psychological risks grounded in documented real-world cases
Extracted harmful interaction patterns from 18 real cases to build 2,160 clinically staged scenarios
Identified 15 failure types across four harm categories