Dual-branch Graph Domain Adaptation for Cross-scenario Multi-modal Emotion Recognition

📅 2026-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited generalization of multimodal conversational emotion recognition across diverse scenarios, where variations in speakers, topics, styles, and noise degrade performance. To tackle this challenge, the authors propose a dual-branch graph-based domain adaptation framework that is, to their knowledge, the first to jointly model domain adaptation and robustness to label noise. The method constructs an emotion interaction hypergraph and employs a dual-branch encoder to capture both local multi-way relationships and global dependencies. A domain adversarial discriminator is integrated to learn domain-invariant representations, while a regularization loss mitigates the adverse effects of label noise. Theoretical analysis yields a tighter generalization bound. Extensive experiments on IEMOCAP and MELD demonstrate that the proposed model significantly outperforms strong baselines in cross-scenario emotion recognition and generalization.
📝 Abstract
Multimodal Emotion Recognition in Conversations (MERC) aims to predict speakers' emotional states in multi-turn dialogues through text, audio, and visual cues. In real-world settings, conversation scenarios differ significantly in speakers, topics, styles, and noise levels. Existing MERC methods generally neglect these cross-scenario variations, limiting their ability to transfer models trained on a source domain to unseen target domains. To address this issue, we propose a Dual-branch Graph Domain Adaptation framework (DGDA) for multimodal emotion recognition under cross-scenario conditions. We first construct an emotion interaction graph to characterize complex emotional dependencies among utterances. A dual-branch encoder, consisting of a hypergraph neural network (HGNN) and a path neural network (PathNN), is then designed to explicitly model multivariate relationships and implicitly capture global dependencies. To enable out-of-domain generalization, a domain adversarial discriminator is introduced to learn invariant representations across domains. Furthermore, a regularization loss is incorporated to suppress the negative influence of noisy labels. To the best of our knowledge, DGDA is the first MERC framework that jointly addresses domain shift and label noise. Theoretical analysis provides tighter generalization bounds, and extensive experiments on IEMOCAP and MELD demonstrate that DGDA consistently outperforms strong baselines and better adapts to cross-scenario conversations. Our code is available at https://github.com/Xudmm1239439/DGDA-Net.
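The domain adversarial discriminator mentioned in the abstract is conventionally trained through a gradient reversal layer (GRL): identity on the forward pass, negated (and scaled) gradient on the backward pass, so the encoder learns features that confuse the domain classifier. The abstract does not detail DGDA's internals, so the following is only a minimal NumPy sketch of that standard mechanism; the class name and the scaling factor `lam` are illustrative, not from the paper.

```python
import numpy as np

class GradientReversal:
    """Minimal sketch of a Gradient Reversal Layer (GRL).

    Forward pass: identity, so features flow unchanged to the domain
    discriminator. Backward pass: gradient is negated and scaled by
    `lam`, pushing the encoder toward domain-invariant representations.
    """

    def __init__(self, lam: float = 1.0):
        self.lam = lam  # trade-off between task loss and domain confusion

    def forward(self, x: np.ndarray) -> np.ndarray:
        # Identity mapping on the forward pass.
        return x

    def backward(self, grad_output: np.ndarray) -> np.ndarray:
        # Reversed, scaled gradient flows back into the feature encoder.
        return -self.lam * grad_output


# Toy check: features pass through unchanged, gradients come back reversed.
grl = GradientReversal(lam=0.5)
feats = np.array([1.0, -2.0, 3.0])
out = grl.forward(feats)            # identical to feats
grad = grl.backward(np.ones(3))     # [-0.5, -0.5, -0.5]
```

In frameworks with autograd (e.g. PyTorch's `torch.autograd.Function`), the same effect is achieved by defining a custom backward; the sketch above just makes the forward/backward asymmetry explicit.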
Problem

Research questions and friction points this paper is trying to address.

Multimodal Emotion Recognition
Cross-scenario
Domain Adaptation
Label Noise
Conversation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Domain Adaptation
Hypergraph Neural Network
Path Neural Network
Cross-scenario Emotion Recognition
Label Noise Robustness