Multi-dataset Joint Pre-training of Emotional EEG Enables Generalizable Affective Computing

📅 2025-10-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address three key challenges in cross-dataset affective EEG recognition—distribution shift, inconsistent affect label definitions across datasets, and high inter-subject variability—this paper proposes a multi-dataset joint pretraining framework. Methodologically, it introduces a novel cross-dataset covariance alignment loss and a hybrid encoder integrating channel-wise Mamba-like linear attention with spatiotemporal dynamic modeling, enabling robust second-order statistical feature alignment and calibration-free recognition. Crucially, the method operates without requiring labeled calibration data from the target domain. Experimental results demonstrate substantial improvements: an average 4.57% gain in AUROC for few-shot emotion recognition and an 11.92% increase in zero-shot transfer accuracy. Moreover, scaling the pretraining dataset consistently enhances performance, achieving up to an 8.55% improvement over single-dataset training baselines.

📝 Abstract
Task-specific pre-training is essential when task representations diverge from generic pre-training features. Existing task-general pre-training EEG models struggle with complex tasks like emotion recognition due to mismatches between task-specific features and broad pre-training approaches. This work aims to develop a task-specific multi-dataset joint pre-training framework for cross-dataset emotion recognition, tackling problems of large inter-dataset distribution shifts, inconsistent emotion category definitions, and substantial inter-subject variability. We introduce a cross-dataset covariance alignment loss to align second-order statistical properties across datasets, enabling robust generalization without the need for extensive labels or per-subject calibration. To capture the long-term dependencies and complex dynamics of EEG, we propose a hybrid encoder combining a Mamba-like linear attention channel encoder and a spatiotemporal dynamics model. Our method outperforms state-of-the-art large-scale EEG models by an average of 4.57% in AUROC for few-shot emotion recognition and 11.92% in accuracy for zero-shot generalization to a new dataset. Performance scales with the number of datasets used in pre-training: multi-dataset joint pre-training achieves a performance gain of 8.55% over single-dataset training. This work provides a scalable framework for task-specific pre-training and highlights its benefit in generalizable affective computing. Our code is available at https://github.com/ncclab-sustech/mdJPT_nips2025.
Problem

Research questions and friction points this paper is trying to address.

Develop task-specific EEG pre-training for emotion recognition
Address dataset distribution shifts and inconsistent emotion definitions
Improve generalization across subjects and datasets with limited labels
Innovation

Methods, ideas, or system contributions that make the work stand out.

Multi-dataset joint pre-training for emotion recognition
Cross-dataset covariance alignment loss for generalization
Hybrid encoder combining Mamba-like attention with spatiotemporal dynamics
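The paper does not spell out the exact form of its cross-dataset covariance alignment loss; a minimal sketch of the general idea, assuming a CORAL-style penalty on pairwise Frobenius distances between per-dataset feature covariances (function names here are illustrative, not from the paper's code):

```python
import numpy as np

def channel_covariance(x):
    """Unbiased covariance of features x, shape (n_samples, n_features)."""
    xc = x - x.mean(axis=0, keepdims=True)
    return xc.T @ xc / (x.shape[0] - 1)

def covariance_alignment_loss(feats_per_dataset):
    """Mean pairwise squared Frobenius distance between dataset covariances.

    feats_per_dataset: list of arrays, one per dataset, each (n_samples, d).
    The 1/(4*d^2) normalization follows the CORAL convention.
    """
    covs = [channel_covariance(f) for f in feats_per_dataset]
    d = covs[0].shape[0]
    loss, pairs = 0.0, 0
    for i in range(len(covs)):
        for j in range(i + 1, len(covs)):
            loss += np.sum((covs[i] - covs[j]) ** 2) / (4 * d * d)
            pairs += 1
    return loss / max(pairs, 1)
```

Minimizing such a term during joint pre-training pulls the second-order statistics of the datasets together, which is one standard way to mitigate the distribution shift the abstract describes.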
Qingzhu Zhang
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China
Jiani Zhong
School of Data Science, University of California, San Diego, La Jolla, CA, USA
Zongsheng Li
School of Science and Engineering, The Chinese University of Hong Kong, Shenzhen, China
Xinke Shen
Southern University of Science and Technology
Affective Brain Computer Interface
Quanying Liu
Department of Biomedical Engineering, Southern University of Science and Technology, Shenzhen, China