AI Summary
This study addresses the challenge of degraded similarity measurement in cross-subject EEG-based emotion recognition caused by inter-subject variability and temporal response asynchrony. To overcome this, the authors propose a Temporal Asynchronous Alignment Contrastive Learning framework (TA²CL), which abandons conventional global hard alignment in favor of a ColBERT-inspired late interaction mechanism. This enables fine-grained, local dynamic matching that adaptively aligns highly correlated EEG segments across subjects. On the FACED dataset, the method achieves accuracies of 64.5% (nine-class) and 79.5% (binary); on SEED and SEED-V, it achieves 86.4% and 70.1%, respectively. The authors present these results as evidence of the method's effectiveness and generalization in cross-subject emotion recognition.
Abstract
With the advancement of science and technology, the importance of emotion research has become increasingly evident. Electroencephalography (EEG)-based emotion recognition has emerged as an active research area in recent years, owing to its objectivity and high temporal resolution. However, most existing methods focus on optimizing encoder structures to enhance feature extraction capabilities, while paying relatively little attention to similarity calculation strategies, particularly overlooking the potential temporal misalignment of responses among different subjects. To address these shortcomings, this paper draws inspiration from the late interaction mechanism of ColBERT in natural language processing (NLP) and proposes a Temporal Asynchronous Alignment-based Contrastive Learning (TA2CL) framework. This method transforms the traditional global "hard alignment" similarity calculation approach into a fine-grained local matching mechanism, enabling the model to adaptively search for and align "locally highly correlated" segments between two EEG signals, thereby effectively mitigating the effects of inter-subject differences and temporal delays. Experimental results demonstrate that the proposed method achieves strong performance across multiple public datasets. Specifically, on the FACED dataset, it achieves an accuracy of 64.5% for the nine-class classification task and 79.5% for the binary classification task, while on the SEED and SEED-V datasets, it achieves accuracies of 86.4% and 70.1%, respectively, validating the method's effectiveness and generalization capability.
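The contrast between global "hard alignment" and late-interaction matching can be illustrated with a small sketch. This is not the paper's implementation, whose details the abstract does not give. It assumes each EEG signal has already been encoded into a sequence of per-segment embedding vectors, and it applies ColBERT's MaxSim operator to them: each segment of one signal is scored against its best-matching segment of the other. The function names, the use of cosine similarity, and the averaging over segments are all assumptions made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def hard_alignment_similarity(x, y):
    """Global hard alignment: segment t of x is compared only with segment t of y.

    Assumes x and y contain the same number of segments.
    """
    return sum(cosine(u, v) for u, v in zip(x, y)) / len(x)


def late_interaction_similarity(x, y):
    """ColBERT-style MaxSim: each segment of x is matched to its most similar
    segment of y, wherever in time it occurs. Averaging the per-segment maxima
    is an illustrative choice (ColBERT itself sums them)."""
    return sum(max(cosine(u, v) for v in y) for u in x) / len(x)


# Hypothetical data: 10 segments with 16-dimensional embeddings.
random.seed(0)
x = [[random.gauss(0.0, 1.0) for _ in range(16)] for _ in range(10)]

# A second "subject" with the same response delayed by two segments
# (circular shift, so every segment of x still appears somewhere in y).
y = x[-2:] + x[:-2]

hard = hard_alignment_similarity(x, y)
late = late_interaction_similarity(x, y)
print(f"hard alignment: {hard:.3f}, late interaction: {late:.3f}")
```

In this toy setting the delayed copy scores close to 1 under late interaction, because every segment finds its shifted counterpart, while the position-by-position score is lower because it compares unrelated segments. Real EEG embeddings would of course not be exact shifted copies; the sketch only shows why a local matching score is less sensitive to temporal delay.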