🤖 AI Summary
Current LLM-based agents for mental health diagnosis face critical bottlenecks: scarcity of annotated clinical data, inability to conduct proactive, adaptive questioning, limited multi-turn dialogue understanding, and inconsistent clinical reasoning. To address these challenges, the authors propose DSM5AgentFlow, the first multi-agent LLM workflow specifically designed for psychiatric disorder diagnosis. The framework integrates prompt engineering, dialogue state tracking, and explainable AI (XAI) techniques to simulate realistic therapist–client interactions and dynamically generate DSM-5-aligned Level-1 cross-cutting symptom assessments. It enhances diagnostic transparency, clinical credibility, and interpretability through structured, auditable reasoning traces. Empirical evaluation shows gains over existing methods in dialogue fidelity, diagnostic accuracy, and result explainability. All datasets and source code are publicly released, underscoring the framework's potential for real-world clinical decision support.
📝 Abstract
LLM-based agents have emerged as transformative tools capable of executing complex tasks through iterative planning and action, achieving significant advancements in understanding and addressing user needs. Yet their effectiveness remains limited in specialized domains such as mental health diagnosis, where they underperform relative to general-purpose applications. Current approaches to integrating diagnostic capabilities into LLMs rely on scarce, highly sensitive mental health datasets, which are challenging to acquire. These methods also fail to emulate clinicians' proactive inquiry skills, lack multi-turn conversational comprehension, and struggle to align outputs with expert clinical reasoning. To address these gaps, we propose DSM5AgentFlow, the first LLM-based agent workflow designed to autonomously generate DSM-5 Level-1 diagnostic questionnaires. By simulating therapist–client dialogues with specific client profiles, the framework delivers transparent, step-by-step disorder predictions, producing explainable and trustworthy results. This workflow serves as a complementary tool for mental health diagnosis, ensuring adherence to ethical and legal standards. Through comprehensive experiments, we evaluate leading LLMs across three critical dimensions: conversational realism, diagnostic accuracy, and explainability. Our datasets and implementations are fully open-sourced.
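The core loop the abstract describes, a therapist agent that tracks dialogue state, probes symptom domains turn by turn against a fixed client profile, and emits an auditable assessment, can be sketched as below. This is a minimal illustration only: the domain list, the severity threshold, and both agent functions are simplified assumptions (real DSM-5 Level-1 screening covers more domains with per-domain thresholds, and the paper's agents are LLM-driven, stubbed out here with deterministic functions).

```python
from dataclasses import dataclass, field

# Hypothetical subset of DSM-5 Level-1 cross-cutting symptom domains.
DOMAINS = ["depression", "anger", "anxiety", "sleep", "substance use"]

@dataclass
class DialogueState:
    """Tracks which domains have been probed and the client's 0-4 severity ratings."""
    ratings: dict = field(default_factory=dict)

    def next_domain(self):
        # First domain not yet assessed, or None when screening is complete.
        return next((d for d in DOMAINS if d not in self.ratings), None)

def therapist_agent(state: DialogueState):
    """Therapist turn: ask about the next unassessed domain.
    Stands in for an LLM prompted with the dialogue state."""
    domain = state.next_domain()
    if domain is None:
        return None
    return f"Over the past two weeks, how much have you been bothered by {domain} problems?"

def client_agent(question: str, profile: dict) -> int:
    """Client turn: answer from a fixed symptom profile.
    Stands in for an LLM role-playing a specific client persona."""
    for domain, severity in profile.items():
        if domain in question:
            return severity
    return 0

def run_session(profile: dict):
    """Simulate a full screening dialogue; return the transcript and flagged domains."""
    state = DialogueState()
    transcript = []
    while (question := therapist_agent(state)) is not None:
        domain = state.next_domain()
        severity = client_agent(question, profile)
        state.ratings[domain] = severity
        transcript.append((question, severity))
    # Illustrative screening heuristic: severity >= 2 ("mild") flags a domain.
    flagged = [d for d, s in state.ratings.items() if s >= 2]
    return transcript, flagged

profile = {"depression": 3, "anxiety": 2, "sleep": 1, "anger": 0, "substance use": 0}
transcript, flagged = run_session(profile)
print(flagged)  # ['depression', 'anxiety']
```

Each `(question, severity)` pair in the transcript doubles as the step-by-step reasoning trace the summary refers to: every flagged domain can be traced back to the exact exchange that produced its rating.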