🤖 AI Summary
Existing multimodal emotion recognition approaches often overlook the brain-region specificity and neuroscientific interpretability of EEG signals, resulting in suboptimal fusion performance. To address this limitation, this work proposes BiMoE, a novel framework that integrates brain-topology-aware EEG partitioning with a Mixture-of-Experts mechanism. Specifically, dedicated expert networks are assigned to distinct brain regions to capture both local and global spatiotemporal features, while separate experts process peripheral physiological signals. Dynamic fusion is achieved through adaptive routing and a joint loss function. Moving beyond conventional black-box paradigms, BiMoE achieves cross-subject classification accuracy improvements of 0.87%–5.19% over state-of-the-art baselines on the DEAP and DREAMER datasets, while retaining neuroscientific interpretability.
📝 Abstract
Multimodal Sentiment Analysis (MSA) that integrates Electroencephalogram (EEG) with peripheral physiological signals (PPS) is crucial for the development of brain-computer interface (BCI) systems. However, existing methods face three major challenges: (1) they overlook the region-specific characteristics of affective processing by treating EEG signals as homogeneous; (2) they treat EEG as a black-box input, offering no insight into the underlying neural representations; (3) they fuse EEG features with complementary PPS features ineffectively. To overcome these issues, we propose BiMoE, a novel brain-inspired mixture-of-experts framework. BiMoE partitions EEG signals in a brain-topology-aware manner, with each expert utilizing a dual-stream encoder to extract local and global spatiotemporal features. A dedicated expert handles PPS using multi-scale large-kernel convolutions. All experts are dynamically fused through adaptive routing and a joint loss function. Evaluated under strict subject-independent settings, BiMoE consistently surpasses state-of-the-art baselines across various affective dimensions. On the DEAP and DREAMER datasets, it yields average accuracy improvements of 0.87% to 5.19% in multimodal sentiment classification. The code is available at: https://github.com/HongyuZhu-s/BiMo.
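To make the fusion step concrete, here is a minimal NumPy sketch of the mixture-of-experts routing idea the abstract describes: several brain-region EEG experts plus one PPS expert each emit a feature vector, and a gating function produces per-trial weights whose convex combination forms the fused representation. All dimensions, the pooled gating statistic, and the random stand-in gating weights are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(x):
    # Numerically stable softmax along the last axis.
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


# Hypothetical sizes: 4 brain-region EEG experts + 1 PPS expert,
# each emitting a 16-dim feature vector for a batch of 8 trials.
n_experts, d_feat, batch = 5, 16, 8
expert_outputs = rng.standard_normal((n_experts, batch, d_feat))

# Adaptive routing (sketch): a pooled summary of the expert features is
# projected to one logit per expert; W_gate stands in for learned weights.
summary = expert_outputs.mean(axis=(0, 2))       # (batch,) pooled statistic
W_gate = rng.standard_normal((1, n_experts))     # stand-in gating parameters
gate = softmax(summary[:, None] * W_gate)        # (batch, n_experts)

# Fused representation: per-trial convex combination of expert features.
fused = np.einsum("be,ebd->bd", gate, expert_outputs)

assert fused.shape == (batch, d_feat)
assert np.allclose(gate.sum(axis=1), 1.0)
```

In the actual model the gating network and experts would be trained jointly (the abstract's joint loss); this sketch only shows the routing arithmetic: softmax weights over experts, then a weighted sum of their outputs.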