🤖 AI Summary
This study addresses the challenge of decoding how the brain encodes visual concepts by constructing cross-modal shared neural representations. Leveraging the complementary temporal and spatial resolutions and neurophysiological sensitivity of EEG, MEG, and fMRI, we propose the first synergistic modeling framework that integrates a multimodal large language model (MLLM) with modality-specific adapters, jointly optimizing cross-modal alignment and task-oriented decoding. Our method enables, for the first time, end-to-end joint representation learning across all three neuroimaging modalities and achieves state-of-the-art performance on multi-subject visual retrieval tasks. We demonstrate that visual concepts exhibit consistent, structured semantic organization across modalities and empirically validate their implicit mapping to physical stimuli. The framework generalizes well and extends naturally to diverse neuroimaging tasks.
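The summary describes the architecture only at a high level. As a rough illustration of the adapter-plus-shared-backbone pattern it names, the following is a minimal PyTorch sketch; every module name, input dimension, and layer choice here is an illustrative assumption, not BrainFLORA's actual implementation (see the repository linked below for that).

```python
# Minimal sketch of modality-specific adapters feeding a shared backbone.
# All names, shapes, and hyperparameters are assumptions, not the actual
# BrainFLORA architecture.
import torch
import torch.nn as nn

class ModalityAdapter(nn.Module):
    """Projects one neuroimaging modality (EEG/MEG/fMRI) into a shared space."""
    def __init__(self, in_dim: int, shared_dim: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(in_dim, shared_dim),
            nn.GELU(),
            nn.LayerNorm(shared_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)

class SharedEncoder(nn.Module):
    """Stand-in for the shared (MLLM-style) backbone over adapter outputs."""
    def __init__(self, shared_dim: int, out_dim: int):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=shared_dim, nhead=8, batch_first=True
        )
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(shared_dim, out_dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        h = self.backbone(tokens).mean(dim=1)  # pool over the token dimension
        return self.head(h)

# One adapter per modality; input dims are placeholder flattened-feature sizes.
adapters = nn.ModuleDict({
    "eeg": ModalityAdapter(in_dim=63 * 250, shared_dim=512),
    "meg": ModalityAdapter(in_dim=271 * 200, shared_dim=512),
    "fmri": ModalityAdapter(in_dim=8000, shared_dim=512),
})
encoder = SharedEncoder(shared_dim=512, out_dim=1024)

eeg = torch.randn(4, 63 * 250)                 # a batch of flattened EEG trials
tokens = adapters["eeg"](eeg).unsqueeze(1)     # each trial -> one shared token
embedding = encoder(tokens)                    # (4, 1024) embedding for decoding
```

The design point this sketch captures is that only the thin adapters are modality-specific; everything downstream of them (backbone and task decoders) is shared, which is what makes end-to-end joint training across EEG, MEG, and fMRI possible.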
📝 Abstract
Understanding how the brain represents visual information is a fundamental challenge in neuroscience and artificial intelligence. While AI-driven decoding of neural data has provided insights into the human visual system, integrating multimodal neuroimaging signals, such as EEG, MEG, and fMRI, remains a critical hurdle due to their inherent spatiotemporal misalignment. Current approaches often analyze these modalities in isolation, limiting a holistic view of neural representation. In this study, we introduce BrainFLORA, a unified framework for integrating cross-modal neuroimaging data to construct a shared neural representation. Our approach leverages multimodal large language models (MLLMs) augmented with modality-specific adapters and task decoders, achieving state-of-the-art performance on the joint-subject visual retrieval task, with the potential to extend to multiple tasks. Combining these representations with neuroimaging analysis methods, we further reveal how visual concept representations align across neural modalities and with real-world object perception. We demonstrate that the brain's structured visual concept representations exhibit an implicit mapping to physical-world stimuli, bridging neuroscience and machine learning across neuroimaging modalities. Beyond its methodological advances, BrainFLORA carries novel implications for cognitive neuroscience and brain-computer interfaces (BCIs). Our code is available at https://github.com/ncclab-sustech/BrainFLORA.
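The abstract does not specify the retrieval objective. A common choice for this kind of brain-to-image retrieval is a CLIP-style symmetric contrastive loss; the sketch below shows that pattern together with a top-k retrieval metric. The loss form, temperature, and evaluation protocol are assumptions for illustration and may differ from the paper's actual objective.

```python
# Hedged sketch of CLIP-style contrastive alignment for visual retrieval.
# The specific loss, temperature, and metric are assumptions, not
# necessarily what BrainFLORA uses.
import torch
import torch.nn.functional as F

def contrastive_loss(brain_emb, image_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of matched (brain, image) pairs."""
    brain_emb = F.normalize(brain_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)
    logits = brain_emb @ image_emb.t() / temperature  # (B, B) similarities
    targets = torch.arange(len(logits), device=logits.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

def topk_retrieval_accuracy(brain_emb, image_emb, k=5):
    """Fraction of brain embeddings whose true image is among the top-k matches."""
    sims = F.normalize(brain_emb, dim=-1) @ F.normalize(image_emb, dim=-1).t()
    topk = sims.topk(k, dim=-1).indices
    targets = torch.arange(len(sims), device=sims.device).unsqueeze(1)
    return (topk == targets).any(dim=1).float().mean().item()
```

Under this formulation, training pulls each trial's brain embedding toward the embedding of the image the subject actually viewed and pushes it away from the other images in the batch, so retrieval at test time reduces to a nearest-neighbor search in the shared embedding space.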