🤖 AI Summary
Existing empathetic response generation methods suffer from three key limitations: insufficient decoupling of emotion and intent, which leads to poor controllability; heavy reliance on large language models (LLMs), which results in high computational overhead; and erroneous emotion recognition, which causes inconsistent responses. This paper proposes ReflectDiffu, a lightweight diffusion-based framework introducing the first dual-reflection mechanism (Exploring-Sampling-Correcting) for joint emotion-intent modeling. It employs an emotion-reasoning mask for fine-grained emotion calibration and integrates intent-mimicking reinforcement learning with emotion-contagion augmentation to enable bidirectional mapping (emotion→intent and intent→emotion) within the diffusion process. ReflectDiffu eliminates dependence on full-parameter LLMs, achieving both efficiency and controllability. Extensive automatic and human evaluations demonstrate state-of-the-art performance, with significant improvements in response relevance (+12.7%), empathetic controllability (+18.3%), and information richness (+9.5%).
📝 Abstract
Empathetic response generation necessitates the integration of emotional and intentional dynamics to foster meaningful interactions. Existing research either neglects the intricate interplay between emotion and intent, leading to suboptimal controllability of empathy, or resorts to large language models (LLMs), which incur significant computational overhead. In this paper, we introduce ReflectDiffu, a lightweight and comprehensive framework for empathetic response generation. This framework incorporates emotion contagion to augment emotional expressiveness and employs an emotion-reasoning mask to pinpoint critical emotional elements. Additionally, it integrates intent mimicry within reinforcement learning for refinement during diffusion. By harnessing an intent twice reflect mechanism of Exploring-Sampling-Correcting, ReflectDiffu adeptly translates emotional decision-making into precise intent actions, thereby addressing empathetic response misalignments stemming from emotional misrecognition. Through reflection, the framework maps emotional states to intents, markedly enhancing both response empathy and flexibility. Comprehensive experiments reveal that ReflectDiffu outperforms existing models in relevance, controllability, and informativeness, achieving state-of-the-art results in both automatic and human evaluations.