🤖 AI Summary
This work addresses the challenge of reconstructing visual images from fMRI signals under cross-subject, zero-shot conditions, where inter-individual variability in neural responses rules out a one-to-one mapping between brain activity and images. The authors propose PictorialCortex, a model trained on UniCortex-fMRI, a newly curated, unified cortical-surface dataset. Operating in a shared cortical latent space, PictorialCortex uses a compositional modeling strategy, reinforced by factorization and re-factorization consistency regularization, to separate stimulus-driven representations from subject-, dataset-, and trial-related variability. By combining cortical-surface normalization, joint training across multiple datasets, and a diffusion-based generative model, the method improves zero-shot cross-subject visual reconstruction, demonstrating the benefit of compositional latent modeling in neural decoding.
📝 Abstract
Decoding visual experiences from human brain activity remains a central challenge at the intersection of neuroscience, neuroimaging, and artificial intelligence. A critical obstacle is the inherent variability of cortical responses: neural activity elicited by the same visual stimulus differs across individuals and trials due to anatomical, functional, cognitive, and experimental factors, making fMRI-to-image reconstruction non-injective. In this paper, we tackle a challenging yet practically meaningful problem: zero-shot cross-subject fMRI-to-image reconstruction, where the visual experience of a previously unseen individual must be reconstructed without subject-specific training. To enable principled evaluation, we present UniCortex-fMRI, a unified cortical-surface dataset assembled from multiple visual-stimulus fMRI datasets to provide broad coverage of subjects and stimuli. UniCortex-fMRI is processed into standardized data formats, which makes it possible to study cross-subject fMRI-to-image reconstruction in the zero-shot setting. To tackle the modeling challenge, we propose PictorialCortex, which models fMRI activity using a compositional latent formulation that structures stimulus-driven representations under subject-, dataset-, and trial-related variability. PictorialCortex operates in a universal cortical latent space and implements this formulation through a latent factorization-composition module, reinforced by paired factorization and re-factorization consistency regularization. During inference, surrogate latents synthesized under multiple seen-subject conditions are aggregated to guide diffusion-based image synthesis for unseen subjects. Extensive experiments show that PictorialCortex improves zero-shot cross-subject visual reconstruction, highlighting the benefits of compositional latent modeling and multi-dataset training.
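The inference step described in the abstract (factorize an unseen subject's latent, recompose it under several seen-subject conditions, then aggregate the surrogates) can be illustrated with a toy sketch. The abstract does not specify the composition operator, the factorization module, or the aggregation rule, so everything here is an assumption chosen for illustration: composition is additive, factorization is an orthogonal projection onto a hypothetical nuisance subspace, and aggregation is a plain mean. The names `factorize`, `compose`, and `surrogate_condition` are hypothetical and are not taken from the paper.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def factorize(latent, nuisance_basis):
    """Split a latent into (stimulus, nuisance) parts.

    ASSUMPTION: nuisance variability lives in the span of `nuisance_basis`,
    whose vectors are orthonormal; the stimulus part is the residual.
    """
    nuisance = [0.0] * len(latent)
    for basis_vec in nuisance_basis:
        coeff = dot(latent, basis_vec)
        nuisance = [n + coeff * b for n, b in zip(nuisance, basis_vec)]
    stimulus = [z - n for z, n in zip(latent, nuisance)]
    return stimulus, nuisance


def compose(stimulus, subject_code):
    """ASSUMPTION: additive composition of stimulus and subject factors."""
    return [s + c for s, c in zip(stimulus, subject_code)]


def surrogate_condition(latent_unseen, nuisance_basis, seen_subject_codes):
    """Build one surrogate latent per seen subject and average them."""
    stimulus, _ = factorize(latent_unseen, nuisance_basis)
    surrogates = [compose(stimulus, code) for code in seen_subject_codes]
    # ASSUMPTION: mean aggregation; the result would condition the
    # diffusion model in place of a subject-specific latent.
    condition = [sum(col) / len(col) for col in zip(*surrogates)]
    return condition, surrogates


if __name__ == "__main__":
    basis = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
    unseen = [0.5, -1.0, 2.0, -3.0]  # stimulus part plus unseen-subject part
    seen_codes = [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, -1.0, 2.0]]
    condition, _ = surrogate_condition(unseen, basis, seen_codes)
    print(condition)
```

In this toy setting, re-factorizing any surrogate returns the same stimulus component. That is the kind of invariance a factorization and re-factorization consistency term would encourage, although the paper's actual loss may differ.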