🤖 AI Summary
Psychological counseling training is resource-intensive and difficult to scale. Method: This study investigates the efficacy of LLM-driven simulation training for novice counselors by developing CARE, an AI-powered framework that integrates simulated patients with generative feedback (including alternative phrasings and rationale-based explanations). A randomized controlled trial compared two interventions: simulation-only practice versus simulation augmented with structured generative feedback. Results: Simulation-only practice significantly worsened empathic responding over time (d = −0.52, p = 0.001), whereas the simulation-plus-feedback condition yielded significant improvements in reflection and questioning skills (d = 0.32–0.39, p < 0.05). Qualitative analysis further confirmed a shift toward client-centered practice. This study provides the first empirical evidence that generative feedback is necessary in LLM-augmented counseling training, establishing the "simulation + feedback" dual-component paradigm as a methodologically grounded and scalable approach for AI-enhanced mental health education.
📝 Abstract
Training more counselors, from clinical students to peer supporters, can help meet the demand for accessible mental health support; however, current training approaches remain resource-intensive and difficult to scale effectively. Large Language Models (LLMs) offer promising solutions for scaling counseling skills training through simulated practice and automated feedback. Despite successes in aligning LLMs with expert-counselor annotations, we do not know whether LLM-based counseling training tools, such as AI patients that simulate real-world challenges and generative AI feedback with suggested alternatives and rationales, actually lead to improvements in novice counselor skill development. We develop CARE, an LLM-simulated practice and feedback system, and randomize 94 novice counselors to practice using an AI patient, either alone or with AI feedback, measuring changes in their behavioral performance, self-assessments, and qualitative learning takeaways. Our results show the practice-and-feedback group improved in their use of reflections and questions (d = 0.32–0.39, p < 0.05). In contrast, the group that practiced with an AI patient alone did not show improvements, and in the case of empathy, actually performed worse over time (d = −0.52, p = 0.001) and relative to the practice-and-feedback group (d = 0.72, p = 0.001). Participants' qualitative self-reflections revealed key differences: the practice-and-feedback group adopted a client-centered approach involving listening to and validating feelings, while the practice-alone group remained solution-oriented but delayed offering suggestions until gathering more information. Overall, these results suggest that LLM-based training systems can promote effective skill development, but that combining both simulated practice and structured feedback is critical.