Subject-Aware Multi-Granularity Alignment for Zero-Shot EEG-to-Image Retrieval

📅 2026-04-20

🤖 AI Summary
Existing zero-shot EEG-to-image retrieval methods generalize poorly across subjects because they usually rely on a single fixed or subject-invariant visual target, and so ignore individual differences in the visual granularity that best matches EEG signals. This work proposes SAMGA, a framework that combines a subject-aware multi-granularity visual supervision objective with a coarse-to-fine cross-modal alignment strategy. SAMGA adaptively aggregates intermediate-layer features from a pretrained visual encoder to build the supervision target, and trains a shared encoder in two stages: the coarse stage stabilizes the shared semantic geometry, and the fine stage improves instance-level discriminability. This balances subject-specific neural response characteristics against cross-subject generalizability. On the THINGS-EEG benchmark, the method reaches intra-subject Top-1 and Top-5 retrieval accuracies of 91.3% and 98.8%, and cross-subject accuracies of 34.4% and 64.8%, outperforming recent state-of-the-art approaches.
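
For readers unfamiliar with the metric, the Top-1 and Top-5 figures are retrieval accuracies: an EEG query counts as a hit if its true image ranks among the k most similar candidates. The following is a minimal sketch of that computation, assuming cosine similarity and that query i matches candidate i. The function names and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k_accuracy(eeg_embs, img_embs, k):
    """Fraction of EEG queries whose true image (same index) ranks in the top k."""
    hits = 0
    for i, query in enumerate(eeg_embs):
        sims = [cosine(query, cand) for cand in img_embs]
        # Candidate indices sorted from most to least similar.
        ranked = sorted(range(len(img_embs)), key=lambda j: sims[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(eeg_embs)

# Toy data (assumption): three candidate images and three EEG queries.
img_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
eeg_embs = [[0.9, 0.1, 0.0], [0.6, 0.5, 0.0], [0.0, 0.2, 0.8]]

top1 = top_k_accuracy(eeg_embs, img_embs, k=1)
top2 = top_k_accuracy(eeg_embs, img_embs, k=2)
print(top1, top2)
```

In this toy case the second query is closer to the first image than to its own, so it misses at k=1 but is recovered at k=2.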

📝 Abstract
Zero-shot EEG-to-image retrieval aims to decode perceived visual content from electroencephalography (EEG) by aligning neural responses with pretrained visual representations, providing a promising route toward scalable visual neural decoding and practical brain-computer interfaces. However, robust EEG-to-image retrieval remains challenging because prior methods usually rely on either a single fixed visual target or a subject-invariant target construction scheme. Such designs overlook two important properties of visually evoked EEG signals: they preserve information across multiple representational scales, and the visual granularity best matched to EEG may vary across subjects. To address these issues, a subject-aware multi-granularity alignment (SAMGA) framework is proposed for zero-shot EEG-to-image retrieval. SAMGA first constructs a subject-aware visual supervision target by adaptively aggregating multiple intermediate representations from a pretrained vision encoder, allowing the model to absorb subject-dependent granularity deviations during training while preserving subject-agnostic inference. Building on this adaptive target construction, a coarse-to-fine cross-modal alignment strategy is further designed with a shared encoder, wherein the coarse stage stabilizes the shared semantic geometry and reduces subject-induced distribution shift, and the fine stage further improves instance-level retrieval discrimination. Extensive experiments on the THINGS-EEG benchmark demonstrate that the proposed method achieves 91.3% Top-1 and 98.8% Top-5 accuracy in the intra-subject setting, and 34.4% Top-1 and 64.8% Top-5 accuracy in the inter-subject setting, outperforming recent state-of-the-art methods.
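
The abstract does not specify how the intermediate representations are aggregated. As an illustration only, the sketch below assumes one plausible form: each subject holds a vector of learnable logits over the encoder layers, a softmax turns them into weights, and the supervision target is the weighted sum of the per-layer image features. The names `aggregate_target` and `subject_logits`, the subject IDs, and the toy features are hypothetical; the features are assumed to have already been projected to a common dimension.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_target(layer_feats, logits):
    """Weighted sum of per-layer image features.

    layer_feats: one feature vector per encoder layer, all the same length.
    logits: one score per layer for a given subject (assumed learnable).
    """
    weights = softmax(logits)
    dim = len(layer_feats[0])
    return [
        sum(w * feat[d] for w, feat in zip(weights, layer_feats))
        for d in range(dim)
    ]

# Toy per-layer features for one image (assumption): three layers, two dimensions.
layer_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Hypothetical per-subject layer logits; sub-02 favors the first layer.
subject_logits = {
    "sub-01": [0.0, 0.0, 0.0],
    "sub-02": [2.0, 0.0, 0.0],
}

targets = {
    subject: aggregate_target(layer_feats, logits)
    for subject, logits in subject_logits.items()
}
print(targets)
```

With uniform logits the target is the plain average of the layers, while a subject whose logits favor one layer pulls the target toward that layer's features. Under this assumption only the training target depends on the subject, which is one way the EEG encoder itself could remain subject-agnostic at inference.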
Problem

Research questions and friction points this paper is trying to address.

zero-shot EEG-to-image retrieval
multi-granularity alignment
subject-aware modeling
visual neural decoding
EEG signal variability
Innovation

Methods, ideas, or system contributions that make the work stand out.

subject-aware alignment
multi-granularity representation
zero-shot EEG-to-image retrieval
cross-modal alignment
visual neural decoding
Lin Jiang
School of Automation, Hangzhou Dianzi University; Zhejiang Provincial Key Laboratory of Brain Computer Collaborative Intelligence Technology and Applications, Hangzhou, Zhejiang 310018, China
Qingshan She
School of Automation, Hangzhou Dianzi University; Zhejiang Provincial Key Laboratory of Brain Computer Collaborative Intelligence Technology and Applications, Hangzhou, Zhejiang 310018, China
Jiale Xu
Tencent ARC Lab
Generative Models, 3D Generation, 3D Reconstruction
Haiqi Xu
School of Automation, Hangzhou Dianzi University; Zhejiang Provincial Key Laboratory of Brain Computer Collaborative Intelligence Technology and Applications, Hangzhou, Zhejiang 310018, China
Duanpo Wu
School of Automation, Hangzhou Dianzi University; Zhejiang Provincial Key Laboratory of Brain Computer Collaborative Intelligence Technology and Applications, Hangzhou, Zhejiang 310018, China
Zhenzhong Kuang
College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China