🤖 AI Summary
Existing psychological client simulators struggle to replicate resistance behaviors observed in real counseling due to excessive compliance, thereby limiting their training efficacy. This work proposes ResistClient, the first client simulator incorporating a motivation-based reasoning mechanism to generate psychologically coherent and challenging responses. ResistClient employs a two-stage framework, RIMR, which first mitigates compliance bias and then models intrinsic motivation to produce authentic resistance. The approach integrates supervised fine-tuning, a large-scale resistance-oriented dialogue dataset (RPC), and process-supervised reinforcement learning to jointly optimize behavioral realism and reasoning coherence. Both automatic metrics and expert evaluations demonstrate that ResistClient significantly outperforms existing simulators in challenge fidelity, behavioral plausibility, and reasoning consistency, effectively enabling robust evaluation of psychological large language models in difficult therapeutic scenarios.
📝 Abstract
Psychological client simulators have emerged as a scalable solution for training and evaluating counselor trainees and psychological LLMs. Yet existing simulators exhibit unrealistic over-compliance, leaving counselors underprepared for the challenging behaviors common in real-world practice. To bridge this gap, we present ResistClient, which systematically models challenging client behaviors grounded in Client Resistance Theory by integrating external behaviors with underlying motivational mechanisms. To this end, we propose Resistance-Informed Motivation Reasoning (RIMR), a two-stage training framework. First, RIMR mitigates compliance bias via supervised fine-tuning on RPC, a large-scale resistance-oriented psychological conversation dataset covering diverse client profiles. Second, beyond surface-level response imitation, RIMR models psychologically coherent motivation reasoning before response generation, jointly optimizing motivation authenticity and response consistency via process-supervised reinforcement learning. Extensive automatic and expert evaluations show that ResistClient substantially outperforms existing simulators in challenge fidelity, behavioral plausibility, and reasoning coherence. Moreover, ResistClient facilitates evaluation of psychological LLMs under challenging conditions, offering new optimization directions for mental health dialogue systems.