Seeing the imagined: a latent functional alignment in visual imagery decoding from fMRI data

📅 2026-04-15

🤖 AI Summary
Visual brain decoding pipelines are primarily optimized for perception, and how well they transfer to mental imagery is less well understood. This work adapts the pretrained DynaDiff perception decoder to imagery with a latent functional alignment approach: imagery-evoked fMRI activity is mapped into the model's conditioning space while the remaining components stay frozen. To compensate for the limited amount of matched imagery-perception data, a retrieval-based augmentation strategy selects semantically related perception trials from NSD. Across four subjects on the Imagery-NSD benchmark, the method consistently improves high-level semantic reconstruction metrics over both the frozen pretrained baseline and a voxel-space ridge alignment baseline, and enables above-chance imagery decoding from multiple cortical regions.
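
To make the idea of alignment in latent space concrete, here is a minimal sketch on synthetic data. It is not the paper's implementation: the frozen encoder is a hypothetical random linear map standing in for a pretrained perception model, the imagery responses are simulated as an attenuated, noisy copy of the perception responses, and closed-form ridge regression is assumed as the alignment map. All names (`frozen_encoder`, `fit_ridge`, `W_enc`) are illustrative.

```python
import numpy as np

def fit_ridge(X, Y, lam=1.0):
    """Closed-form ridge regression: W = (X^T X + lam * I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

rng = np.random.default_rng(0)
n_trials, n_voxels, n_latent = 200, 50, 16

# Hypothetical stand-in for a frozen, pretrained perception encoder
# (a fixed random linear map; NOT the DynaDiff architecture).
W_enc = rng.normal(size=(n_voxels, n_latent)) / np.sqrt(n_voxels)

def frozen_encoder(voxels):
    return voxels @ W_enc

# Synthetic paired trials. Assumption: imagery responses are a weaker,
# noisier version of the perception responses to the same stimuli.
perception = rng.normal(size=(n_trials, n_voxels))
imagery = 0.4 * perception + 0.1 * rng.normal(size=(n_trials, n_voxels))

# Both conditions pass through the same frozen encoder.
z_perc = frozen_encoder(perception)
z_img = frozen_encoder(imagery)

# Fit the alignment map on training trials only; the encoder is never updated.
train, test = slice(0, 150), slice(150, None)
A = fit_ridge(z_img[train], z_perc[train], lam=1.0)

# Compare held-out latent error with and without alignment.
mse_unaligned = float(np.mean((z_img[test] - z_perc[test]) ** 2))
mse_aligned = float(np.mean((z_img[test] @ A - z_perc[test]) ** 2))
print(f"unaligned MSE: {mse_unaligned:.3f}, aligned MSE: {mse_aligned:.3f}")
```

Only the small map `A` is learned, so whatever consumes the latents downstream (in the paper's setting, the generative backbone) can stay untouched.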

📝 Abstract
Recent progress in visual brain decoding from fMRI has been enabled by large-scale datasets such as the Natural Scenes Dataset (NSD) and powerful diffusion-based generative models. While current pipelines are primarily optimized for perception, their performance under mental imagery remains less well understood. In this work, we study how a state-of-the-art (SOTA) perception decoder (DynaDiff) can be adapted to reconstruct imagined content from the Imagery-NSD benchmark. We propose a latent functional alignment approach that maps imagery-evoked activity into the pretrained model's conditioning space, while keeping the remaining components frozen. To mitigate the limited amount of matched imagery-perception supervision, we further introduce a retrieval-based augmentation strategy that selects semantically related NSD perception trials. Across four subjects, latent functional alignment consistently improves high-level semantic reconstruction metrics relative to the frozen pretrained baseline and a voxel-space ridge alignment baseline, and enables above-chance decoding from multiple cortical regions. These results suggest that semantic structure learned from perception can be leveraged to stabilize and improve visual imagery decoding under out-of-distribution conditions.
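
The retrieval-based augmentation step can be illustrated with a short sketch. The abstract does not say which embedding or similarity measure is used, so this example assumes cosine similarity over generic semantic embeddings and uses random vectors in place of real stimulus embeddings. The function `retrieve_top_k` and all array names are hypothetical.

```python
import numpy as np

def retrieve_top_k(query_emb, pool_emb, k=5):
    """For each query, return the indices and cosine similarities of the
    k most similar pool items, ordered from most to least similar."""
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    p = pool_emb / np.linalg.norm(pool_emb, axis=1, keepdims=True)
    sims = q @ p.T                              # (n_queries, n_pool)
    top = np.argsort(-sims, axis=1)[:, :k]      # descending similarity
    return top, np.take_along_axis(sims, top, axis=1)

rng = np.random.default_rng(0)

# Stand-in for semantic embeddings of perception-trial stimuli (assumed).
pool_emb = rng.normal(size=(1000, 64))

# Stand-in for imagery-target embeddings: noisy copies of three pool items,
# so the expected nearest neighbour of each query is known.
true_ids = np.array([3, 42, 7])
query_emb = pool_emb[true_ids] + 0.05 * rng.normal(size=(3, 64))

idx, sims = retrieve_top_k(query_emb, pool_emb, k=5)
print(idx[:, 0])  # nearest pool item for each imagery target
```

In an augmentation setting, the fMRI trials associated with the retrieved indices would be added as extra supervision for the corresponding imagery target. How many neighbours to keep and how to weight them are design choices this sketch leaves open.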

Problem

Research questions and friction points this paper is trying to address.

visual imagery decoding
fMRI
mental imagery
brain decoding
perception-to-imagery transfer

Innovation

Methods, ideas, or system contributions that make the work stand out.

latent functional alignment
visual imagery decoding
fMRI
diffusion-based generative models
retrieval-based augmentation