Towards Interpretable Visual Decoding with Attention to Brain Representations

📅 2025-09-27
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing fMRI-to-image decoding methods typically rely on intermediate feature spaces (e.g., image or text embeddings), obscuring the dynamic, region-specific contributions of cortical areas to the generative process. To address this, we propose NeuroAdapter—a novel framework that enables end-to-end, direct conditioning of latent diffusion models (LDMs) on fMRI signals, bypassing intermediate representations to preserve neural information fidelity. Furthermore, we introduce IBBI (Interpretable Bidirectional Brain–Image), a bidirectional interpretability framework that quantitatively characterizes the spatiotemporal modulation of generation stages by distinct brain regions through analysis of cross-attention weight distributions across diffusion timesteps. Evaluated on public fMRI datasets, our method achieves reconstruction quality competitive with state-of-the-art approaches while substantially enhancing the interpretability of neural–image correspondences. This work establishes a new paradigm for brain–computer interfaces and computational neuroscience by unifying high-fidelity neural decoding with mechanistic, process-level interpretability.
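The IBBI analysis described above works by examining how cross-attention weight distributions over brain inputs evolve across diffusion timesteps. As a rough illustration of that general idea only, the sketch below aggregates per-region attention mass per timestep; all shapes, the token-to-region mapping, and the random attention values are toy assumptions, not the paper's actual pipeline.

```python
import numpy as np

# Illustrative dimensions (assumed): diffusion timesteps, attention heads,
# image-latent query tokens, and brain-signal key tokens.
T, H, Q, B = 50, 8, 64, 12
rng = np.random.default_rng(0)
attn = rng.random((T, H, Q, B))
attn /= attn.sum(axis=-1, keepdims=True)  # normalized over brain tokens

# Assumed mapping from brain tokens to three cortical regions (e.g., ROIs),
# four tokens per region.
region_of_token = np.repeat(np.arange(3), 4)
n_regions = 3

# Attention mass each region receives at each timestep,
# averaged over heads and query tokens.
region_mass = np.zeros((T, n_regions))
for r in range(n_regions):
    region_mass[:, r] = attn[..., region_of_token == r].sum(-1).mean(axis=(1, 2))

# Each row is a distribution over regions, so rows sum to 1; plotting
# region_mass against timestep would show which regions dominate which
# stage of generation -- the kind of spatiotemporal profile IBBI reports.
assert np.allclose(region_mass.sum(axis=1), 1.0)
```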

📝 Abstract
Recent work has demonstrated that complex visual stimuli can be decoded from human brain activity using deep generative models, helping brain science researchers interpret how the brain represents real-world scenes. However, most current approaches map brain signals into intermediate image or text feature spaces before guiding the generative process, masking the contributions of different brain areas to the final reconstruction. In this work, we propose NeuroAdapter, a visual decoding framework that directly conditions a latent diffusion model on brain representations, bypassing the need for intermediate feature spaces. Our method achieves competitive visual reconstruction quality on public fMRI datasets compared with prior work, while providing greater transparency into how brain signals shape the generation process. To this end, we contribute an Image-Brain BI-directional interpretability framework (IBBI), which investigates cross-attention mechanisms across diffusion denoising steps to reveal how different cortical areas influence the unfolding generative trajectory. Our results highlight the potential of end-to-end brain-to-image decoding and establish a path toward interpreting diffusion models through the lens of visual neuroscience.
Problem

Research questions and friction points this paper is trying to address.

Directly mapping brain signals to images without intermediate features
Providing transparency in how brain activity shapes image generation
Revealing cortical area influences on visual decoding process
Innovation

Methods, ideas, or system contributions that make the work stand out.

Directly conditions diffusion model on brain signals
Bypasses intermediate feature spaces for decoding
Uses cross-attention to interpret brain area contributions
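The mechanism behind these contributions -- image latents attending directly to brain-signal tokens instead of text embeddings -- can be sketched as a single cross-attention step. Everything below (dimensions, token counts, weight matrices) is an assumed toy setup, not the NeuroAdapter implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(latents, brain_tokens, Wq, Wk, Wv):
    """Image latents (queries) attend to brain-signal tokens (keys/values),
    standing in for the text-embedding conditioning of a standard LDM."""
    q = latents @ Wq        # (Q, d) queries from image latents
    k = brain_tokens @ Wk   # (B, d) keys from brain tokens
    v = brain_tokens @ Wv   # (B, d) values from brain tokens
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (Q, B)
    return weights @ v, weights

rng = np.random.default_rng(0)
d_lat, d_brain, d = 16, 32, 16
latents = rng.standard_normal((64, d_lat))  # 64 image-latent tokens (assumed)
brain = rng.standard_normal((12, d_brain))  # 12 fMRI ROI tokens (assumed)
Wq = rng.standard_normal((d_lat, d)) * 0.1
Wk = rng.standard_normal((d_brain, d)) * 0.1
Wv = rng.standard_normal((d_brain, d)) * 0.1
out, weights = cross_attention(latents, brain, Wq, Wk, Wv)
# Each row of `weights` is a distribution over brain tokens -- the raw
# signal an IBBI-style analysis would track across denoising steps.
```

In a real LDM, such a layer sits inside each UNet block and runs at every denoising step, so the attention weights form a timestep-indexed record of which brain inputs drive which parts of the image.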
Pinyuan Feng
Zuckerman Mind Brain Behavior Institute, Columbia University, New York, USA
Hossein Adeli
Columbia University
Visual Cognition, System Neuroscience, Attention
Wenxuan Guo
Zuckerman Mind Brain Behavior Institute, Columbia University, New York, USA
Fan Cheng
Shanghai Jiao Tong University
Information Theory
Ethan Hwang
Zuckerman Mind Brain Behavior Institute, Columbia University, New York, USA
Nikolaus Kriegeskorte
Professor of Psychology and Neuroscience, Columbia University
vision, neural networks, fMRI, neuronal recordings, pattern-information analysis