DEAP DIVE: Dataset Investigation with Vision transformers for EEG evaluation

📅 2025-10-01
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This study addresses the need for low-cost affective computing by investigating emotion recognition performance using sparse-channel EEG signals. To this end, we propose an end-to-end framework integrating the Continuous Wavelet Transform (CWT) and a Vision Transformer (ViT) on the DEAP dataset: 12-channel EEG time-series signals are first converted into time-frequency representations via CWT; subsequently, the ViT directly models these time-frequency features for quadrant-based emotion classification (valence–arousal space). Experimental results demonstrate that the method achieves 91.57% classification accuracy using only 12 channels, reducing electrode count by 62.5% compared to the standard 32-channel setup, while remaining competitive with full-channel performance (96.9% reported with 32 channels). This validates the feasibility of sparse EEG acquisition for affective computing, significantly lowering hardware cost and signal-acquisition complexity. The implementation is fully open-sourced to ensure reproducibility.
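As a concrete illustration of the pipeline summarized above, the sketch below converts one EEG trial into per-channel scaleograms with the Continuous Wavelet Transform, using the PyWavelets library. The Morlet wavelet, the scale range, and the 128 Hz sampling rate (DEAP's preprocessed rate) are assumptions for illustration, not settings confirmed by the paper.

```python
import numpy as np
import pywt  # PyWavelets

# Assumed settings: DEAP's preprocessed sampling rate is 128 Hz; the Morlet
# wavelet and the scale range are illustrative choices, not the paper's.
FS = 128
SCALES = np.arange(1, 65)  # 64 scales -> 64 frequency rows per scaleogram

def eeg_to_scaleogram(channel: np.ndarray) -> np.ndarray:
    """Turn one EEG channel (1-D time series) into a 2-D time-frequency image."""
    coeffs, _freqs = pywt.cwt(channel, SCALES, "morl", sampling_period=1.0 / FS)
    return np.abs(coeffs)  # shape: (n_scales, n_samples)

def trial_to_image(trial: np.ndarray) -> np.ndarray:
    """Stack per-channel scaleograms: (12, n_samples) -> (12, n_scales, n_samples)."""
    return np.stack([eeg_to_scaleogram(ch) for ch in trial])
```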

📝 Abstract
Accurately predicting emotions from brain signals has the potential to advance goals such as improving mental health, human-computer interaction, and affective computing. Emotion prediction through neural signals offers a promising alternative to traditional methods such as self-assessment and facial expression analysis, which can be subjective or ambiguous. Measuring brain activity via electroencephalography (EEG) provides a more direct and unbiased data source. However, conducting a full EEG is a complex, resource-intensive process, which has led to the rise of low-cost EEG devices with simplified measurement capabilities. This work examines how subsets of EEG channels from the DEAP dataset can be used for sufficiently accurate emotion prediction with low-cost EEG devices, rather than fully equipped EEG setups. Using the Continuous Wavelet Transformation to convert EEG data into scaleograms, we trained a vision transformer (ViT) model for emotion classification. The model achieved 91.57% accuracy in predicting four quadrants (high/low arousal crossed with high/low valence) with only 12 measuring points (also referred to as channels). Our work clearly shows that a significantly reduced set of input channels yields results competitive with the state-of-the-art accuracy of 96.9% achieved with 32 channels. Training scripts to reproduce our results can be found here: https://gitlab.kit.edu/kit/aifb/ATKS/public/AutoSMiLeS/DEAP-DIVE.
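The four-quadrant target described in the abstract can be derived from DEAP's continuous 1–9 self-assessment ratings. A minimal sketch follows, assuming the common midpoint threshold of 5; the paper's exact labeling rule is not stated on this page.

```python
def quadrant_label(valence: float, arousal: float, threshold: float = 5.0) -> int:
    """Map DEAP's continuous 1-9 self-assessment ratings to one of four classes.

    Classes combine high/low valence with high/low arousal. The midpoint
    threshold of 5 is a common convention for DEAP, assumed here rather
    than taken from the paper.
    """
    high_valence = valence >= threshold
    high_arousal = arousal >= threshold
    # 0: LV/LA, 1: LV/HA, 2: HV/LA, 3: HV/HA (the encoding itself is arbitrary)
    return int(high_valence) * 2 + int(high_arousal)
```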
Problem

Research questions and friction points this paper is trying to address.

Predicting emotions from reduced EEG channels using vision transformers
Evaluating low-cost EEG devices for accurate emotion classification
Achieving high accuracy with minimal EEG measurement points (a channel-selection sketch follows this list)
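The channel-reduction question amounts to selecting a subset of DEAP's 32 electrodes before feature extraction. A sketch of that selection step is below; the 12-channel subset shown is purely illustrative, since this page does not state which channels the paper retains.

```python
import numpy as np

# DEAP's 32 channels in the dataset's documented (Geneva) order.
DEAP_CHANNELS = [
    "Fp1", "AF3", "F3", "F7", "FC5", "FC1", "C3", "T7",
    "CP5", "CP1", "P3", "P7", "PO3", "O1", "Oz", "Pz",
    "Fp2", "AF4", "Fz", "F4", "F8", "FC6", "FC2", "Cz",
    "C4", "T8", "CP6", "CP2", "P4", "P8", "PO4", "O2",
]

def select_channels(trial: np.ndarray, keep: list[str]) -> np.ndarray:
    """Reduce a (32, n_samples) DEAP trial to the chosen channel subset."""
    rows = [DEAP_CHANNELS.index(name) for name in keep]
    return trial[rows]

# A hypothetical 12-channel subset (frontal/temporal emphasis) for illustration.
SUBSET_12 = ["Fp1", "Fp2", "AF3", "AF4", "F3", "F4",
             "F7", "F8", "T7", "T8", "C3", "C4"]
```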
Innovation

Methods, ideas, or system contributions that make the work stand out.

Using vision transformers for EEG emotion classification
Converting EEG data into scaleograms via wavelet transformation
Achieving high accuracy with reduced EEG input channels (a training sketch follows this list)
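To make the ViT contribution concrete, here is a minimal fine-tuning sketch using the timm library. The backbone name, input resolution, channel stacking, and hyperparameters are assumptions for illustration, not the paper's reported configuration.

```python
import torch
import timm

# Assumed configuration: a pretrained ViT-Base backbone adapted to accept the
# 12 stacked scaleograms as input channels and emit 4 quadrant logits.
model = timm.create_model(
    "vit_base_patch16_224",  # illustrative backbone choice
    pretrained=True,
    in_chans=12,             # one scaleogram per retained EEG channel
    num_classes=4,           # valence-arousal quadrants
)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = torch.nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """images: (batch, 12, 224, 224) scaleograms resized to the ViT input size;
    labels: (batch,) quadrant indices in [0, 3]."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```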
Annemarie Hoffsommer
Karlsruhe Institute of Technology (KIT), Germany
Helen Schneider
Karlsruhe Institute of Technology (KIT), Germany
Svetlana Pavlitska
Karlsruhe Institute of Technology (KIT), Germany; FZI Research Center for Information Technology, Germany
J. Marius Zöllner
Professor at Karlsruhe Institute of Technology (KIT), Director at Forschungszentrum Informatik (FZI)
Intelligent Vehicles · Autonomous Driving · Robotics · Artificial Intelligence · Machine Learning