🤖 AI Summary
Existing fMRI-to-video reconstruction methods rely on incomplete semantic embeddings and therefore struggle to bridge the semantic gap between neural signals and dynamic visual content. Inspired by the dual-pathway processing mechanism of the human brain, this work proposes CineNeuron, a hierarchical framework with two synergistic stages: a bottom-up semantic enrichment stage that integrates textual, visual, action, and object-category cues, and a top-down Mixture-of-Memories mechanism that retrieves relevant memories from previously seen data and fuses them with the fMRI embedding. On two fMRI-to-video benchmarks, CineNeuron surpasses state-of-the-art methods across multiple evaluation metrics.
📝 Abstract
Reconstructing dynamic visual experiences as videos from functional magnetic resonance imaging (fMRI) is pivotal for advancing the understanding of neural processes. However, current fMRI-to-video reconstruction methods are hindered by a semantic gap between noisy fMRI signals and the rich content of videos. This gap stems from a reliance on incomplete semantic embeddings that neither capture video-specific cues (e.g., actions) nor integrate prior knowledge. To address this, we draw inspiration from the dual-pathway processing mechanism in the human brain and introduce CineNeuron, a novel hierarchical framework for semantically enhanced video reconstruction from fMRI signals with two synergistic stages. First, a bottom-up semantic enrichment stage maps fMRI signals to a rich embedding space that comprehensively captures textual semantics, image contents, action concepts, and object categories. Second, a top-down memory integration stage uses the proposed Mixture-of-Memories method to dynamically select relevant "memories" from previously seen data and fuse them with the fMRI embedding to refine the video reconstruction. Extensive experimental results on two fMRI-to-video benchmarks demonstrate that CineNeuron surpasses state-of-the-art methods across various metrics.
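The abstract does not specify how Mixture-of-Memories selects and fuses memories, so the following is only an illustrative sketch of the general idea of retrieving relevant memories and fusing them with an fMRI embedding. It assumes cosine-similarity top-k retrieval, softmax weighting, and a simple linear blend. The function names and the parameters `k`, `temperature`, and `alpha` are hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def select_and_fuse(query, memories, k=2, temperature=0.1, alpha=0.5):
    """Pick the k memories most similar to `query` and blend them into it.

    query    : fMRI-derived embedding (list of floats)
    memories : embeddings of previously seen data (list of lists of floats)
    alpha    : assumed blend weight; 0 keeps the query, 1 keeps only the memories
    Returns (fused_embedding, indices_of_selected_memories).
    """
    if not memories:
        return list(query), []

    sims = [cosine(query, m) for m in memories]
    # Indices of the k most similar memories, best first.
    top = sorted(range(len(memories)), key=lambda i: sims[i], reverse=True)[:k]

    # Softmax over the selected similarities (shifted by the max for stability).
    best = max(sims[i] for i in top)
    exps = [math.exp((sims[i] - best) / temperature) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted average of the selected memories, one dimension at a time.
    retrieved = [
        sum(w * memories[i][d] for w, i in zip(weights, top))
        for d in range(len(query))
    ]
    # Linear blend of the original embedding and the retrieved memory.
    fused = [(1 - alpha) * q + alpha * r for q, r in zip(query, retrieved)]
    return fused, top


if __name__ == "__main__":
    fmri_embedding = [1.0, 0.0]
    memory_bank = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
    fused, chosen = select_and_fuse(fmri_embedding, memory_bank, k=2)
    print(chosen, fused)
```

In this sketch the selection is hard (top-k) and the fusion is a fixed blend. A learned gating network or cross-attention over the memory bank would be a natural alternative, and the paper's actual design may differ from both.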