🤖 AI Summary
Existing empathetic dialogue systems lack a unified strategy framework and explicit reasoning mechanisms, which limits their ability to model the cognitive complexity of empathy. This work proposes STRIDE-ED, a framework that models empathetic dialogue through strategy-anchored, interpretable multi-stage reasoning. We develop a strategy-aware data refinement pipeline that combines LLM-assisted annotation, multi-model consistency-weighted evaluation, and dynamic sampling, then apply a two-stage training paradigm combining supervised fine-tuning and multi-objective reinforcement learning to align model behavior with target emotions, empathetic strategies, and response formats. Experiments show that STRIDE-ED consistently outperforms existing methods on both automatic metrics and human evaluations and generalizes across diverse open-source LLMs.
📝 Abstract
Empathetic dialogue requires not only recognizing a user's emotional state but also making strategy-aware, context-sensitive decisions throughout response generation. However, the lack of a comprehensive empathy strategy framework, explicit task-aligned multi-stage reasoning, and high-quality strategy-aware data fundamentally limits existing approaches, preventing them from effectively modeling empathetic dialogue as a complex, multi-stage cognitive and decision-making process. To address these challenges, we propose STRIDE-ED, a STRategy-grounded, Interpretable, and DEep reasoning framework that models Empathetic Dialogue through structured, strategy-conditioned reasoning. To support effective learning, we develop a strategy-aware data refinement pipeline integrating LLM-based annotation, multi-model consistency-weighted evaluation, and dynamic sampling to construct high-quality training data aligned with empathetic strategies. Furthermore, we adopt a two-stage training paradigm that combines supervised fine-tuning with multi-objective reinforcement learning to better align model behaviors with target emotions, empathetic strategies, and response formats. Extensive experiments demonstrate that STRIDE-ED generalizes across diverse open-source LLMs and consistently outperforms existing methods on both automatic metrics and human evaluations.
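The abstract names three alignment targets for the reinforcement learning stage (emotion, strategy, and response format) without stating how they are combined. One way such a multi-objective reward could be composed is a weighted sum, sketched below. Everything in this sketch is an assumption made for illustration and is not taken from the paper: the tagged output layout, the exact-match scoring, the weight values, and the names `RESPONSE_PATTERN`, `RewardWeights`, and `multi_objective_reward` are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical output layout; the abstract does not specify the real format.
RESPONSE_PATTERN = re.compile(
    r"^<emotion>(?P<emotion>[^<]+)</emotion>\s*"
    r"<strategy>(?P<strategy>[^<]+)</strategy>\s*"
    r"<response>(?P<response>.+)</response>$",
    re.DOTALL,
)


@dataclass(frozen=True)
class RewardWeights:
    """Illustrative weights only; these are not values from the paper."""
    emotion: float = 0.4
    strategy: float = 0.4
    fmt: float = 0.2


def multi_objective_reward(
    output: str,
    target_emotion: str,
    target_strategy: str,
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Score one generated output against emotion, strategy, and format targets."""
    match = RESPONSE_PATTERN.match(output.strip())
    if match is None:
        # Malformed output: nothing can be parsed, so no objective earns reward.
        return 0.0
    emotion_ok = match["emotion"].strip().lower() == target_emotion.lower()
    strategy_ok = match["strategy"].strip().lower() == target_strategy.lower()
    # Well-formed output earns the format term; each matched label adds its term.
    return (
        weights.fmt
        + weights.emotion * emotion_ok
        + weights.strategy * strategy_ok
    )
```

Gating the emotion and strategy terms on a successful parse is a design choice of this sketch: a response that ignores the expected layout cannot collect partial credit. A real system might instead use graded scores from a classifier or judge model for each objective.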