Multimodal Functional Maximum Correlation for Emotion Recognition

📅 2025-12-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Existing self-supervised methods for affect recognition from multimodal physiological signals (e.g., EEG and EDA) rely mostly on pairwise alignment objectives, which cannot characterize dependencies among more than two modalities and miss the higher-order interactions between coordinated brain and autonomic responses. To address this, the paper proposes Multimodal Functional Maximum Correlation (MFMC), a self-supervised framework that maximizes higher-order multimodal dependence through a Dual Total Correlation (DTC) objective. The authors derive a tight sandwich bound on this objective and optimize it with a trace surrogate based on Functional Maximum Correlation Analysis (FMCA), so joint interactions between central (brain) and autonomic nervous system responses are captured directly, without pairwise contrastive losses. On CEAP-360VR, MFMC raises subject-dependent accuracy from 78.9% to 86.8% (+7.9 percentage points) and subject-independent accuracy from 27.5% to 33.1% (+5.6 percentage points) using the EDA signal alone. On the most challenging subject-independent EEG split of MAHNOB-HCI, it stays within 0.8 percentage points of the best-performing method.

📝 Abstract

Emotional states manifest as coordinated yet heterogeneous physiological responses across central and autonomic systems, posing a fundamental challenge for multimodal representation learning in affective computing. Learning such joint dynamics is further complicated by the scarcity and subjectivity of affective annotations, which motivates the use of self-supervised learning (SSL). However, most existing SSL approaches rely on pairwise alignment objectives, which are insufficient to characterize dependencies among more than two modalities and fail to capture higher-order interactions arising from coordinated brain and autonomic responses. To address this limitation, we propose Multimodal Functional Maximum Correlation (MFMC), a principled SSL framework that maximizes higher-order multimodal dependence through a Dual Total Correlation (DTC) objective. By deriving a tight sandwich bound and optimizing it using a functional maximum correlation analysis (FMCA) based trace surrogate, MFMC captures joint multimodal interactions directly, without relying on pairwise contrastive losses. Experiments on three public affective computing benchmarks demonstrate that MFMC consistently achieves state-of-the-art or competitive performance under both subject-dependent and subject-independent evaluation protocols, highlighting its robustness to inter-subject variability. In particular, MFMC improves subject-dependent accuracy on CEAP-360VR from 78.9% to 86.8%, and subject-independent accuracy from 27.5% to 33.1% using the EDA signal alone. Moreover, MFMC remains within 0.8 percentage points of the best-performing method on the most challenging EEG subject-independent split of MAHNOB-HCI. Our code is available at https://github.com/DY9910/MFMC.
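For readers unfamiliar with the quantity the abstract names, the standard information-theoretic definitions of total correlation (TC) and dual total correlation (DTC) are given below for reference. The symbols $X_1, \dots, X_n$ (one random variable per modality) and $X_{\setminus i}$ (all variables except $X_i$) are notation chosen here, not taken from the paper, and the paper's sandwich bound and FMCA trace surrogate are not reproduced.

```latex
\begin{align}
  \mathrm{TC}(X_1,\dots,X_n)  &= \sum_{i=1}^{n} H(X_i) - H(X_1,\dots,X_n), \\
  \mathrm{DTC}(X_1,\dots,X_n) &= H(X_1,\dots,X_n) - \sum_{i=1}^{n} H\!\left(X_i \mid X_{\setminus i}\right).
\end{align}
```

For $n = 2$ both quantities reduce to the mutual information $I(X_1; X_2)$. They differ only when three or more variables are involved, which is the setting the abstract describes as beyond the reach of pairwise objectives.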

Problem

Research questions and friction points this paper is trying to address.

Capturing higher-order multimodal dependencies in emotion recognition systems

Addressing limitations of pairwise alignment in self-supervised affective computing

Overcoming scarcity and subjectivity of emotional annotations through SSL

Innovation

Methods, ideas, or system contributions that make the work stand out.

MFMC maximizes higher-order multimodal dependence via a Dual Total Correlation (DTC) objective

It derives a tight sandwich bound on DTC and optimizes it with an FMCA-based trace surrogate

The framework captures joint multimodal interactions directly, without pairwise contrastive losses
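To make the Dual Total Correlation quantity concrete, here is a minimal sketch that evaluates it in closed form for jointly Gaussian scalar variables. This is an illustration only, not the paper's method: MFMC works on learned features and optimizes an FMCA-based trace surrogate, whereas the function `gaussian_dtc` and its helper below are hypothetical names introduced here, and the Gaussian assumption is ours. The sketch uses only the standard library; in practice a linear-algebra package would replace the hand-written elimination.

```python
import math


def _inverse_and_logdet(cov):
    """Gauss-Jordan elimination with partial pivoting.

    Returns (inverse, log|det|) of a square matrix given as nested lists.
    """
    n = len(cov)
    # Build the augmented matrix [cov | I].
    a = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(cov)]
    logdet = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        logdet += math.log(abs(p))  # |det| is the product of the pivots
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col and a[r][col] != 0.0:
                f = a[r][col]
                a[r] = [vr - f * vc for vr, vc in zip(a[r], a[col])]
    return [row[n:] for row in a], logdet


def gaussian_dtc(cov):
    """Dual total correlation (in nats) of jointly Gaussian scalar variables.

    DTC = H(X) - sum_i H(X_i | X_rest). For Gaussians the (2*pi*e) terms
    cancel and Var(X_i | X_rest) = 1 / precision[i][i], which gives
    DTC = 0.5 * log det(cov) + 0.5 * sum_i log precision[i][i].
    """
    precision, logdet = _inverse_and_logdet(cov)
    return 0.5 * logdet + 0.5 * sum(
        math.log(precision[i][i]) for i in range(len(cov))
    )


if __name__ == "__main__":
    # Two variables: DTC equals the pairwise mutual information.
    print(gaussian_dtc([[1.0, 0.6], [0.6, 1.0]]))
    # Three equicorrelated variables: a single joint dependence score.
    print(gaussian_dtc([[1.0, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 1.0]]))
```

With two variables the result coincides with the mutual information, -0.5 * log(1 - rho^2). With three or more it gives one score for the joint dependence among all variables rather than a set of pairwise terms, which is the property the highlights above attribute to the DTC objective.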
