Dual-branch Graph Domain Adaptation for Cross-scenario Multi-modal Emotion Recognition

📅 2026-03-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the limited generalization of multimodal conversational emotion recognition across diverse scenarios, where variations in speakers, topics, styles, and noise degrade performance. To tackle this challenge, the authors propose a dual-branch graph-based domain adaptation framework that is, to their knowledge, the first to jointly model domain adaptation and robustness to label noise. The method constructs an emotion interaction hypergraph and employs a dual-branch encoder to capture both local multi-way relationships and global dependencies. A domain adversarial discriminator is integrated to learn domain-invariant representations, while a regularization loss mitigates the adverse effects of label noise. Theoretical analysis yields a tighter generalization bound. Extensive experiments on IEMOCAP and MELD demonstrate that the proposed model significantly outperforms strong baselines in cross-scenario emotion recognition and generalization.
📝 Abstract
Multimodal Emotion Recognition in Conversations (MERC) aims to predict speakers' emotional states in multi-turn dialogues through text, audio, and visual cues. In real-world settings, conversation scenarios differ significantly in speakers, topics, styles, and noise levels. Existing MERC methods generally neglect these cross-scenario variations, limiting their ability to transfer models trained on a source domain to unseen target domains. To address this issue, we propose a Dual-branch Graph Domain Adaptation framework (DGDA) for multimodal emotion recognition under cross-scenario conditions. We first construct an emotion interaction graph to characterize complex emotional dependencies among utterances. A dual-branch encoder, consisting of a hypergraph neural network (HGNN) and a path neural network (PathNN), is then designed to explicitly model multivariate relationships and implicitly capture global dependencies. To enable out-of-domain generalization, a domain adversarial discriminator is introduced to learn invariant representations across domains. Furthermore, a regularization loss is incorporated to suppress the negative influence of noisy labels. To the best of our knowledge, DGDA is the first MERC framework that jointly addresses domain shift and label noise. Theoretical analysis provides tighter generalization bounds, and extensive experiments on IEMOCAP and MELD demonstrate that DGDA consistently outperforms strong baselines and better adapts to cross-scenario conversations. Our code is available at https://github.com/Xudmm1239439/DGDA-Net.
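The domain adversarial discriminator mentioned in the abstract is conventionally trained through a gradient reversal layer (GRL): identity on the forward pass, negated (and scaled) gradient on the backward pass, so the encoder learns features that confuse the domain classifier. The abstract does not detail DGDA's internals, so the following is only a minimal NumPy sketch of that standard mechanism; the class name and the scaling factor `lam` are illustrative, not from the paper.

```python
import numpy as np

class GradientReversal:
    """Minimal sketch of a Gradient Reversal Layer (GRL).

    Forward pass: identity, so features flow unchanged to the domain
    discriminator. Backward pass: gradient is negated and scaled by
    `lam`, pushing the encoder toward domain-invariant representations.
    """

    def __init__(self, lam: float = 1.0):
        self.lam = lam  # trade-off between task loss and domain confusion

    def forward(self, x: np.ndarray) -> np.ndarray:
        # Identity mapping on the forward pass.
        return x

    def backward(self, grad_output: np.ndarray) -> np.ndarray:
        # Reversed, scaled gradient flows back into the feature encoder.
        return -self.lam * grad_output


# Toy check: features pass through unchanged, gradients come back reversed.
grl = GradientReversal(lam=0.5)
feats = np.array([1.0, -2.0, 3.0])
out = grl.forward(feats)            # identical to feats
grad = grl.backward(np.ones(3))     # [-0.5, -0.5, -0.5]
```

In frameworks with autograd (e.g. PyTorch's `torch.autograd.Function`), the same effect is achieved by defining a custom backward; the sketch above just makes the forward/backward asymmetry explicit.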
Problem

Research questions and friction points this paper is trying to address.

Multimodal Emotion Recognition
Cross-scenario
Domain Adaptation
Label Noise
Conversation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph Domain Adaptation
Hypergraph Neural Network
Path Neural Network
Cross-scenario Emotion Recognition
Label Noise Robustness