Multi-dataset Joint Pre-training of Emotional EEG Enables Generalizable Affective Computing

📅 2025-10-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address three key challenges in cross-dataset affective EEG recognition—distribution shift, inconsistent affect label definitions across datasets, and high inter-subject variability—this paper proposes a multi-dataset joint pretraining framework. Methodologically, it introduces a novel cross-dataset covariance alignment loss and a hybrid encoder integrating channel-wise Mamba-like linear attention with spatiotemporal dynamic modeling, enabling robust second-order statistical feature alignment and calibration-free recognition. Crucially, the method operates without requiring labeled calibration data from the target domain. Experimental results demonstrate substantial improvements: an average 4.57% gain in AUROC for few-shot emotion recognition and an 11.92% increase in zero-shot transfer accuracy. Moreover, scaling the pretraining dataset consistently enhances performance, achieving up to an 8.55% improvement over single-dataset training baselines.

📝 Abstract
Task-specific pre-training is essential when task representations diverge from generic pre-training features. Existing task-general pre-training EEG models struggle with complex tasks like emotion recognition due to mismatches between task-specific features and broad pre-training approaches. This work aims to develop a task-specific multi-dataset joint pre-training framework for cross-dataset emotion recognition, tackling problems of large inter-dataset distribution shifts, inconsistent emotion category definitions, and substantial inter-subject variability. We introduce a cross-dataset covariance alignment loss to align second-order statistical properties across datasets, enabling robust generalization without the need for extensive labels or per-subject calibration. To capture the long-term dependencies and complex dynamics of EEG, we propose a hybrid encoder combining a Mamba-like linear attention channel encoder and a spatiotemporal dynamics model. Our method outperforms state-of-the-art large-scale EEG models by an average of 4.57% in AUROC for few-shot emotion recognition and 11.92% in accuracy for zero-shot generalization to a new dataset. Performance scales with the number of datasets used in pre-training: multi-dataset joint pre-training achieves a performance gain of 8.55% over single-dataset training. This work provides a scalable framework for task-specific pre-training and highlights its benefit in generalizable affective computing. Our code is available at https://github.com/ncclab-sustech/mdJPT_nips2025.
Problem

Research questions and friction points this paper is trying to address.

Develop task-specific EEG pre-training for emotion recognition
Address dataset distribution shifts and inconsistent emotion definitions
Improve generalization across subjects and datasets with limited labels
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-dataset joint pre-training for emotion recognition
Cross-dataset covariance alignment loss for generalization
Hybrid encoder combining Mamba-like attention with spatiotemporal dynamics
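The paper does not spell out the exact form of its cross-dataset covariance alignment loss; a minimal sketch of the general idea, assuming a CORAL-style penalty on pairwise Frobenius distances between per-dataset feature covariances (function names here are illustrative, not from the paper's code):

```python
import numpy as np

def channel_covariance(x):
    """Unbiased covariance of features x, shape (n_samples, n_features)."""
    xc = x - x.mean(axis=0, keepdims=True)
    return xc.T @ xc / (x.shape[0] - 1)

def covariance_alignment_loss(feats_per_dataset):
    """Mean pairwise squared Frobenius distance between dataset covariances.

    feats_per_dataset: list of arrays, one per dataset, each (n_samples, d).
    The 1/(4*d^2) normalization follows the CORAL convention.
    """
    covs = [channel_covariance(f) for f in feats_per_dataset]
    d = covs[0].shape[0]
    loss, pairs = 0.0, 0
    for i in range(len(covs)):
        for j in range(i + 1, len(covs)):
            loss += np.sum((covs[i] - covs[j]) ** 2) / (4 * d * d)
            pairs += 1
    return loss / max(pairs, 1)
```

Minimizing such a term during joint pre-training pulls the second-order statistics of the datasets together, which is one standard way to mitigate the distribution shift the abstract describes.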
Qingzhu Zhang
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China
Jiani Zhong
School of Data Science, University of California, San Diego, La Jolla, CA, USA
Zongsheng Li
School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China
Xinke Shen
Southern University of Science and Technology
Affective Brain Computer Interface
Quanying Liu
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China