🤖 AI Summary
This study addresses three key limitations of conversational agents: delayed emotion recognition, poor individual adaptability, and distorted emotional expression. We propose the first framework that integrates electroencephalography (EEG) and other physiological signals (e.g., heart rate, galvanic skin response) in a closed loop within the perception–response pipeline of dialogue agents, enabling real-time empathic interaction. Methodologically, we combine deep learning–based emotion decoding, cross-modal emotion alignment, and physiology-driven co-rendering of facial expressions and prosody to achieve millisecond-level emotion tracking and synchronized generation of dynamic vocal intonation and micro-expressions. User studies show a 37% increase in emotional arousal intensity and significantly higher interaction engagement. These results support the efficacy and feasibility of using neurophysiological signals for fine-grained, low-latency empathic communication in naturalistic dialogue settings.
📝 Abstract
Conversational agents (CAs) are revolutionizing human-computer interaction by evolving from text-based chatbots to empathetic digital humans (DHs) capable of rich emotional expression. This paper explores the integration of neural and physiological signals into the perception module of CAs to enhance empathetic interactions. By leveraging these cues, the study aims to detect emotions in real time and generate empathetic responses and expressions. We conducted a user study in which participants conversed with a DH about emotional topics. The DH responded and displayed expressions by mirroring the emotions detected in real time from neural and physiological cues. The results indicate that participants experienced stronger emotions and greater engagement during interactions with the Empathetic DH, demonstrating the effectiveness of incorporating neural and physiological signals for real-time emotion recognition. However, several challenges were identified, including recognition accuracy, emotional transition speeds, individual personality effects, and limitations in voice tone modulation. Addressing these challenges is crucial for further refining Empathetic DHs and fostering meaningful connections between humans and artificial entities. Overall, this research advances human-agent interaction and highlights the potential of real-time neural and physiological emotion recognition in creating empathetic DHs.