🤖 AI Summary
This work addresses the fidelity loss and representational shift inherent in reconstructing visual stimuli from EEG/MEG signals by proposing a collaborative training framework that integrates multimodal priors—specifically image, text, depth, and edge cues. The approach combines a streamlined alignment module with a pretrained diffusion model and introduces an uncertainty-weighted similarity scoring mechanism to quantify the fidelity of each modality. Furthermore, a fusion encoder is designed to integrate shared representations across modalities, enabling more precise cross-modal alignment. Evaluated on the THINGS-EEG dataset, the method achieves substantial improvements over the state-of-the-art CognitionCapturer, with Top-1 and Top-5 retrieval accuracy gains of 25.9% and 10.6%, respectively.
📝 Abstract
Reconstructing visual stimuli from EEG remains challenging due to fidelity loss and representation shift. We propose CognitionCapturerPro, an enhanced framework that integrates EEG with multimodal priors (images, text, depth, and edges) via collaborative training. Our core contributions are an uncertainty-weighted similarity scoring mechanism that quantifies modality-specific fidelity and a fusion encoder that integrates shared representations across modalities. Using a simplified alignment module and a pre-trained diffusion model, our method significantly outperforms the original CognitionCapturer on the THINGS-EEG dataset, improving Top-1 and Top-5 retrieval accuracy by 25.9% and 10.6%, respectively. Code is available at: https://github.com/XiaoZhangYES/CognitionCapturerPro.
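The uncertainty-weighted similarity scoring idea can be sketched roughly as follows: compute a similarity between the EEG embedding and each modality's embedding, then down-weight modalities whose learned uncertainty is high. This is a minimal illustrative sketch, not the paper's exact formulation; the use of cosine similarity and a softmax over negative log-variances (in the spirit of standard uncertainty weighting) are assumptions, and `uncertainty_weighted_score` is a hypothetical helper name.

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two 1-D embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def uncertainty_weighted_score(eeg_emb, modality_embs, log_vars):
    """Fuse per-modality similarities into one score.

    eeg_emb       : 1-D EEG embedding
    modality_embs : list of 1-D embeddings (e.g. image, text, depth, edge)
    log_vars      : learned log-variance per modality; higher means the
                    modality's alignment is treated as less reliable

    NOTE: the weighting scheme here (normalized exp(-log_var)) is an
    assumption for illustration, not the paper's published mechanism.
    """
    sims = np.array([cosine_sim(eeg_emb, m) for m in modality_embs])
    weights = np.exp(-np.asarray(log_vars, dtype=float))
    weights /= weights.sum()          # normalize so weights sum to 1
    return float(np.dot(weights, sims)), weights
```

In a retrieval setting, this fused score would replace a single-modality similarity when ranking candidate images, letting low-fidelity modalities (say, a noisy depth cue) contribute less to the final ranking.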