🤖 AI Summary
To address poor cross-subject and cross-session generalization, the scarcity of high-quality EEG data, and the temporal confounds introduced by block-design paradigms in EEG-based visual decoding, this work introduces EEG-ImageNet, the first large-scale EEG dataset for visual decoding with multi-granularity annotations. It comprises high-temporal-resolution EEG responses from 16 subjects viewing 4,000 ImageNet images, annotated with both coarse-grained (category-level) and fine-grained (sub-category-level) semantic labels, making it five times larger than existing benchmarks, and it establishes the first standardized benchmark for EEG visual decoding. Methodologically, the work pairs portable EEG acquisition with hybrid CNN/Transformer architectures and unifies evaluation across classification (accuracy) and reconstruction (two-way identification rate) tasks. The best-performing model achieves around 60% category-level classification accuracy and a 64% reconstruction identification rate, advancing portable visual BCIs and the computational modeling of biological vision.
📝 Abstract
Identifying and reconstructing what we see from brain activity offers unique insight into how the biological visual system represents the world. While recent efforts have achieved high-performance image classification and high-quality image reconstruction from brain signals collected by functional Magnetic Resonance Imaging (fMRI) or magnetoencephalography (MEG), the expense and bulk of these devices make it difficult to translate such methods into practical applications. Electroencephalography (EEG), on the other hand, despite its ease of use, cost-efficiency, high temporal resolution, and non-invasive nature, has not been fully explored in such studies due to the lack of comprehensive datasets. To address this gap, we introduce EEG-ImageNet, a novel EEG dataset comprising recordings from 16 subjects exposed to 4,000 images selected from the ImageNet dataset. EEG-ImageNet contains five times as many EEG-image pairs as existing similar EEG benchmarks. Its image stimuli carry multi-granularity labels, i.e., 40 categories with coarse-grained labels and 40 with fine-grained labels. Based on this dataset, we establish benchmarks for object classification and image reconstruction. Experiments with several commonly used models show that the best models achieve object classification accuracy of around 60% and image reconstruction with a two-way identification rate of around 64%. These results demonstrate the dataset's potential to advance EEG-based visual brain-computer interfaces, deepen our understanding of visual perception in biological systems, and suggest applications in improving machine vision models.
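For readers unfamiliar with the reconstruction metric mentioned above, two-way identification asks, for each reconstructed image, whether it is more similar to its own ground-truth stimulus than to a distractor image; the reported rate is the fraction of such pairwise comparisons answered correctly (50% is chance). The sketch below is a minimal illustration of that idea using cosine similarity between precomputed image feature vectors; the feature extractor and exact pairing protocol used in the paper are assumptions here, not the authors' implementation.

```python
import numpy as np


def two_way_identification(recon_feats: np.ndarray, true_feats: np.ndarray) -> float:
    """Fraction of (target, distractor) pairs where a reconstruction is
    closer, by cosine similarity, to its own ground-truth image than to
    the distractor.

    recon_feats, true_feats: (n, d) arrays of image feature vectors,
    where row i of each array corresponds to the same trial.
    """
    # Normalize rows so that dot products equal cosine similarities.
    r = recon_feats / np.linalg.norm(recon_feats, axis=1, keepdims=True)
    t = true_feats / np.linalg.norm(true_feats, axis=1, keepdims=True)
    sims = r @ t.T  # sims[i, j] = cosine(recon_i, true_j)

    correct, total = 0, 0
    n = sims.shape[0]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            total += 1
            # A comparison counts as correct if reconstruction i matches
            # its own ground truth better than distractor j.
            if sims[i, i] > sims[i, j]:
                correct += 1
    return correct / total
```

With perfectly matched features the rate is 1.0, while identical (uninformative) reconstructions score 0.0 under the strict inequality; a random decoder hovers around 0.5.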