🤖 AI Summary
To address data scarcity in EEG-based emotion recognition for brain–computer interfaces, this paper introduces, for the first time, Conditional Denoising Diffusion Probabilistic Models (CDDPMs) for synthetic EEG signal generation. By incorporating controllable noise injection, CDDPMs produce high-fidelity, temporally coherent physiological signals with consistent class labels. Compared to GANs and standard DDPMs, the proposed method significantly improves the generalizability and classification transferability of generated samples, overcoming key limitations of conventional generative models in modeling EEG time-series dynamics. Extensive evaluation on the DEAP dataset, via fine-tuning and validation across SVM, LSTM, and Transformer classifiers, demonstrates up to a 4.21% absolute improvement in emotion classification accuracy. Moreover, in low-data regimes, synthetically augmented training sets substantially improve downstream model robustness and generalization.
📝 Abstract
Emotions are crucial in human life, influencing perceptions, relationships, behaviour, and choices. Emotion recognition using Electroencephalography (EEG) in the Brain-Computer Interface (BCI) domain presents significant challenges, particularly the need for extensive datasets. This study aims to generate synthetic EEG samples that resemble real samples yet remain distinct from them by adding noise to a conditional denoising diffusion probabilistic model, thus addressing the prevalent issue of data scarcity in EEG research. The proposed method is tested on the DEAP dataset and shows up to a 4.21% improvement in classification performance when synthetic data are used, a larger gain than traditional GAN-based and DDPM-based approaches achieve. The proposed diffusion-based approach to EEG data generation appears promising for improving the accuracy of emotion recognition systems and marks a notable contribution to EEG-based emotion recognition. Our research further evaluates the effectiveness of state-of-the-art classifiers on EEG data, employing both real and synthetic data with varying noise levels.