Reflecting Twice before Speaking with Empathy: Self-Reflective Alternating Inference for Empathy-Aware End-to-End Spoken Dialogue

📅 2026-01-26
🤖 AI Summary
Current end-to-end spoken dialogue systems rely on rigid supervisory signals, making it difficult to capture the complexity and diversity of empathetic expression. To address this limitation, this work proposes ReEmpathy—an end-to-end dialogue model endowed with empathetic self-reflection capabilities. ReEmpathy dynamically perceives and refines its empathetic responses through an alternating process of spoken response generation and free-form empathetic reflection reasoning. The core innovation lies in the introduction of EmpathyEval, the first natural language–based empathy evaluation model, which drives the self-reflection mechanism. Experimental results demonstrate that ReEmpathy significantly enhances the quality of empathy-sensitive dialogues, offering a novel paradigm for developing emotionally intelligent human–computer interaction systems.

📝 Abstract
End-to-end Spoken Language Models (SLMs) hold great potential for paralinguistic perception, and numerous studies have aimed to enhance their capabilities, particularly for empathetic dialogue. However, current approaches largely depend on rigid supervised signals, such as ground-truth responses in supervised fine-tuning or preference scores in reinforcement learning. Such reliance is fundamentally limiting for modeling complex empathy: there is no single "correct" response, and a simple numerical score cannot fully capture the nuances of emotional expression or the appropriateness of empathetic behavior. To address these limitations, we first introduce EmpathyEval, a descriptive natural-language-based evaluation model for assessing empathetic quality in spoken dialogues. Building upon EmpathyEval, we propose ReEmpathy, an end-to-end SLM that enhances empathetic dialogue through a novel Empathetic Self-Reflective Alternating Inference mechanism, which interleaves spoken response generation with free-form, empathy-related reflective reasoning. Extensive experiments demonstrate that ReEmpathy substantially improves empathy-sensitive spoken dialogue by enabling reflective reasoning, offering a promising approach toward more emotionally intelligent and empathy-aware human-computer interaction.
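The interleaving described in the abstract can be pictured with a small control-flow sketch. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how responses are segmented or how many reflection steps occur, and in ReEmpathy a single end-to-end model produces both speech and reflections. The names `alternating_inference`, `toy_generate`, `toy_reflect`, and `num_segments` are hypothetical stand-ins, and plain strings stand in for speech segments.

```python
from typing import Callable, List, Tuple


def alternating_inference(
    generate: Callable[[str, List[str]], str],
    reflect: Callable[[str, str], str],
    user_turn: str,
    num_segments: int = 3,
) -> Tuple[List[str], List[str]]:
    """Alternate between producing a response segment and a free-form reflection.

    Hypothetical sketch: each new segment is conditioned on the reflections
    written so far, so later segments can adjust their empathetic stance.
    """
    segments: List[str] = []
    reflections: List[str] = []
    for _ in range(num_segments):
        # Step 1: produce the next piece of the response.
        segment = generate(user_turn, reflections)
        segments.append(segment)
        # Step 2: write a natural-language reflection on that piece.
        reflections.append(reflect(user_turn, segment))
    return segments, reflections


# Toy stand-ins (assumptions): a real system would call the spoken language model here.
def toy_generate(user_turn: str, reflections: List[str]) -> str:
    return f"response segment {len(reflections) + 1}"


def toy_reflect(user_turn: str, segment: str) -> str:
    return f"reflection on '{segment}'"


segments, reflections = alternating_inference(
    toy_generate, toy_reflect, "I failed my exam today.", num_segments=2
)
```

The point of the sketch is the ordering: a reflection is a free-form text, not a scalar score, and it feeds into the next generation step rather than being applied only after the full response is finished.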
Problem

Research questions and friction points this paper is trying to address.

empathy
spoken dialogue
end-to-end spoken language models
paralinguistic perception
empathetic evaluation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Reflective Inference
Empathy-Aware Dialogue
Spoken Language Models
Natural-Language Evaluation
Empathetic Reasoning
Yuhang Jia
College of Computer Science, Nankai University
Pei Liu
Meituan LongCat Interaction Team
Haoqin Sun
Nankai University
Affective computing, Speech signal processing, Audio understanding
Jiaming Zhou
Nankai University
Automatic Speech Recognition, Speech processing
Xuxin Cheng
University of California, San Diego
Cao Liu
Meituan LongCat Interaction Team
Ke Zeng
Meituan LongCat Interaction Team
Xunliang Cai
Meituan LongCat Interaction Team
Yong Qin
Nankai University
speech technologies, AI