Hypergraph Multi-Modal Learning for EEG-based Emotion Recognition in Conversation

📅 2025-02-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the challenge of effectively fusing physiological signals, particularly electroencephalography (EEG), with audiovisual modalities in Emotion Recognition in Conversation (ERC). We propose the first multimodal ERC framework integrating EEG, audio, and video. Our key innovation is Hyper-MML, a hypergraph-based multimodal learning framework featuring a Multimodal Hypergraph Fusion Module (MHFM) that explicitly models high-order, cross-modal interactions, going beyond conventional graph models limited to pairwise relationships. The framework incorporates EEG time-frequency feature extraction, multimodal feature alignment, and cross-modal attention-based fusion. Evaluated on the EAV dataset, our method achieves a 6.2% absolute accuracy improvement over state-of-the-art approaches. This work establishes a novel, interpretable, and deployable paradigm for the assisted diagnosis of clinical communication disorders, including autism spectrum disorder and depression, by leveraging neurophysiological and behavioral signals in conversational contexts.
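To make the fusion idea concrete, the sketch below shows one way a hypergraph fusion layer over EEG, audio, and video nodes could look in PyTorch. It is a minimal illustration, not the authors' MHFM: the names (MultimodalHypergraphFusion, hypergraph_conv) and the one-hyperedge-per-utterance construction are assumptions, and the convolution follows the standard normalized hypergraph formulation X' = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2} X Theta with W taken as the identity.

```python
# Minimal sketch of hypergraph-based multimodal fusion (PyTorch).
# Illustrative only; not the paper's MHFM implementation.
import torch
import torch.nn as nn


def hypergraph_conv(X, H, theta):
    """One hypergraph convolution: X' = Dv^-1/2 H W De^-1 H^T Dv^-1/2 X Theta.

    X: (num_nodes, d) node features, H: (num_nodes, num_edges) incidence
    matrix, W assumed identity, theta: learnable linear map.
    """
    Dv = H.sum(dim=1).clamp(min=1)            # node degrees
    De = H.sum(dim=0).clamp(min=1)            # hyperedge degrees
    Dv_inv_sqrt = torch.diag(Dv.pow(-0.5))
    De_inv = torch.diag(De.pow(-1.0))
    A = Dv_inv_sqrt @ H @ De_inv @ H.t() @ Dv_inv_sqrt
    return torch.relu(A @ theta(X))


class MultimodalHypergraphFusion(nn.Module):
    """Projects EEG/audio/video features into a shared space and applies a
    hypergraph convolution in which each hyperedge connects all three
    modality nodes of one utterance (a higher-order, non-pairwise relation)."""

    def __init__(self, d_eeg, d_audio, d_video, d_model=128):
        super().__init__()
        self.proj = nn.ModuleDict({
            "eeg": nn.Linear(d_eeg, d_model),
            "audio": nn.Linear(d_audio, d_model),
            "video": nn.Linear(d_video, d_model),
        })
        self.theta = nn.Linear(d_model, d_model)

    def forward(self, eeg, audio, video):
        # eeg/audio/video: (num_utterances, d_modality)
        n = eeg.size(0)
        X = torch.cat([self.proj["eeg"](eeg),
                       self.proj["audio"](audio),
                       self.proj["video"](video)], dim=0)    # (3n, d_model)
        # One hyperedge per utterance linking its EEG, audio, and video nodes.
        H = torch.zeros(3 * n, n)
        for i in range(n):
            H[i, i] = H[n + i, i] = H[2 * n + i, i] = 1.0
        Z = hypergraph_conv(X, H, self.theta)                 # (3n, d_model)
        # Average the three modality views to get one fused vector per utterance.
        return (Z[:n] + Z[n:2 * n] + Z[2 * n:]) / 3.0
```

Because each hyperedge joins all three modality nodes of an utterance at once, a single message-passing step captures a three-way interaction that pairwise graph edges cannot express.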

📝 Abstract
Emotion Recognition in Conversation (ERC) is an important method for diagnosing health conditions such as autism or depression, as well as for understanding the emotions of individuals who struggle to express their feelings. Current ERC methods rely primarily on complete semantic information from text, audio, and visual data, but face challenges in integrating physiological signals such as the electroencephalogram (EEG). This paper proposes a novel Hypergraph Multi-Modal Learning Framework (Hyper-MML), designed to effectively identify emotions in conversation by integrating EEG with audio and video information to capture complex emotional dynamics. At its core is a Multi-modal Hypergraph Fusion Module (MHFM) that explicitly models higher-order relationships among multi-modal signals. Experimental results on the EAV dataset demonstrate that Hyper-MML significantly outperforms traditional methods in emotion recognition. The proposed Hyper-MML can serve as an effective communication tool for healthcare professionals, enabling better engagement with patients who have difficulty expressing their emotions.
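As a rough illustration of the EEG time-frequency features mentioned above, the snippet below computes band-wise differential entropy (DE), a common choice in EEG emotion recognition. The paper does not specify its exact feature pipeline, so the frequency bands, Butterworth filter, and DE formulation here are assumptions for illustration only.

```python
# Sketch of EEG time-frequency feature extraction via band-wise
# differential entropy (DE). Bands/filter/DE are assumed, not the
# paper's confirmed pipeline.
import numpy as np
from scipy.signal import butter, filtfilt

BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 14),
         "beta": (14, 31), "gamma": (31, 50)}


def differential_entropy(x):
    """DE of a (near-)Gaussian signal: 0.5 * ln(2 * pi * e * var)."""
    return 0.5 * np.log(2 * np.pi * np.e * np.var(x) + 1e-12)


def eeg_band_features(eeg, fs=250):
    """eeg: (channels, samples) array -> (channels, num_bands) DE features.

    Assumes the segment is long enough for zero-phase filtering (filtfilt).
    """
    feats = np.zeros((eeg.shape[0], len(BANDS)))
    for j, (lo, hi) in enumerate(BANDS.values()):
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        filtered = filtfilt(b, a, eeg, axis=1)
        feats[:, j] = [differential_entropy(ch) for ch in filtered]
    return feats
```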
Problem

Research questions and friction points this paper is trying to address.

Integrates EEG with audio and video for emotion recognition.
Addresses challenges in combining physiological signals in ERC.
Improves emotion recognition accuracy using Hypergraph Multi-Modal Learning.
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hypergraph Multi-Modal Learning Framework for EEG integration
Multi-modal Hypergraph Fusion Module for higher-order relationships
Enhanced emotion recognition combining EEG, audio, and video
Zijian Kang
Lab of Digital Image and Intelligent Computation, Shanghai Maritime University, Shanghai 201306, China
Yueyang Li
Lab of Digital Image and Intelligent Computation, Shanghai Maritime University, Shanghai 201306, China
Shengyu Gong
Lab of Digital Image and Intelligent Computation, Shanghai Maritime University, Shanghai 201306, China
Weiming Zeng
Lab of Digital Image and Intelligent Computation, Shanghai Maritime University, Shanghai 201306, China
Hongjie Yan
Affiliated Lianyungang Hospital of Xuzhou Medical University, Lianyungang 222002, China
Lingbin Bian
The University of Hong Kong
Bayesian statistics, machine learning, computational neuroscience, brain connectivity
Wai Ting Siok
The Hong Kong Polytechnic University
Reading development, Chinese reading, Developmental dyslexia, Neuroimaging, fMRI
Nizhuan Wang
The Hong Kong Polytechnic University (PolyU)
AI, Brain-Computer Interface, Neuroimaging, Computational Linguistics, Neurolinguistics