A Cognitive Process-Inspired Architecture for Subject-Agnostic Brain Visual Decoding

📅 2025-11-04

📈 Citations: 0

✨ Influential: 0

career value

242K/year

🤖 AI Summary

This study addresses the cross-subject fMRI-based visual decoding challenge—reconstructing continuous naturalistic visual experiences without subject-specific training. Key bottlenecks include the absence of explicit modeling of the ventral–dorsal visual stream functional hierarchy and poor generalization of semantic representations across subjects. To overcome these, we propose VCFlow: the first decoding framework that explicitly embeds a hierarchical ventral–dorsal stream architecture within the model, jointly capturing early visual cortex responses and multi-stream high-level cognitive features via feature disentanglement and feature-level contrastive learning. Experiments demonstrate that VCFlow achieves near-state-of-the-art reconstruction fidelity (only 7% accuracy drop), reduces single-video decoding time to 10 seconds, and enables zero-shot cross-subject transfer without fine-tuning. These advances significantly enhance clinical deployability and cross-subject generalization robustness.

Technology Category

Application Category

📝 Abstract

Subject-agnostic brain decoding, which aims to reconstruct continuous visual experiences from fMRI without subject-specific training, holds great potential for clinical applications. However, this direction remains underexplored due to challenges in cross-subject generalization and the complex nature of brain signals. In this work, we propose Visual Cortex Flow Architecture (VCFlow), a novel hierarchical decoding framework that explicitly models the ventral-dorsal architecture of the human visual system to learn multi-dimensional representations. By disentangling and leveraging features from early visual cortex, ventral, and dorsal streams, VCFlow captures diverse and complementary cognitive information essential for visual reconstruction. Furthermore, we introduce a feature-level contrastive learning strategy to enhance the extraction of subject-invariant semantic representations, thereby enhancing subject-agnostic applicability to previously unseen subjects. Unlike conventional pipelines that need more than 12 hours of per-subject data and heavy computation, VCFlow sacrifices only 7% accuracy on average yet generates each reconstructed video in 10 seconds without any retraining, offering a fast and clinically scalable solution. The source code will be released upon acceptance of the paper.

Problem

Research questions and friction points this paper is trying to address.

Reconstructing visual experiences from fMRI without subject-specific training data

Overcoming cross-subject generalization challenges in brain signal decoding

Reducing computational requirements for clinically scalable brain decoding solutions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Hierarchical framework modeling ventral-dorsal visual pathways

Feature-level contrastive learning for subject-invariant representations

Fast reconstruction without retraining using disentangled cognitive features

🔎 Similar Papers

Brain-aligning of semantic vectors improves neural decoding of visual stimuli

2024-03-22Citations: 0

Achieving more human brain-like vision via human EEG representational alignment

2024-01-30arXiv.orgCitations: 4

Toyota Research Institute

Los Altos, CA

Research Scientist Intern, Multimodal Generative AI and Robotics (PhD)