SLIM-Brain: A Data- and Training-Efficient Foundation Model for fMRI Data Analysis

📅 2025-12-26
🤖 AI Summary
fMRI foundation models suffer from low data efficiency (requiring massive datasets) and poor training efficiency (due to the memory explosion caused by voxel-wise modeling). To address these bottlenecks, we propose a graph-free, lightweight two-stage adaptive architecture. In the first stage, a lightweight temporal extractor identifies highly salient time windows; in the second stage, a 4D hierarchical JEPA encoder models only the top-k windows while discarding roughly 70% of masked patches, preserving voxel-level spatial fidelity while achieving substantial computational compression. The model enables end-to-end, atlas-free pretraining and achieves state-of-the-art performance across seven public benchmarks using only 4,000 fMRI pre-training sessions. It reduces GPU memory consumption to roughly 30% of that of conventional voxel-based methods. To our knowledge, this is the first fMRI foundation model to simultaneously achieve high data efficiency and high training efficiency.

📝 Abstract
Foundation models are emerging as a powerful paradigm for fMRI analysis, but current approaches face a dual bottleneck of data and training efficiency. Atlas-based methods aggregate voxel signals into fixed regions of interest, reducing data dimensionality but discarding fine-grained spatial details, and they require extremely large cohorts to train effectively as general-purpose foundation models. Atlas-free methods, on the other hand, operate directly on voxel-level information; they preserve spatial fidelity but are prohibitively memory- and compute-intensive, making large-scale pre-training infeasible. We introduce SLIM-Brain (Sample-efficient, Low-memory fMRI Foundation Model for Human Brain), a new atlas-free foundation model that simultaneously improves both data and training efficiency. SLIM-Brain adopts a two-stage adaptive design: (i) a lightweight temporal extractor captures global context across full sequences and ranks data windows by saliency, and (ii) a 4D hierarchical encoder (Hiera-JEPA) learns fine-grained voxel-level representations only from the top-$k$ selected windows, while discarding about 70% of masked patches. Extensive experiments across seven public benchmarks show that SLIM-Brain establishes new state-of-the-art performance on diverse tasks, while requiring only 4,000 pre-training sessions and approximately 30% of the GPU memory of traditional voxel-level methods.
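The first stage described above can be sketched as a saliency-ranked window selector: score sliding temporal windows with a cheap extractor, then pass only the top-$k$ to the heavy 4D encoder. This is an illustrative sketch, not the paper's actual implementation; the function and parameter names (`select_topk_windows`, `window_len`, `k`) and the use of non-overlapping windows are assumptions.

```python
import torch

def select_topk_windows(fmri, extractor, window_len=16, k=4):
    """Stage-1 sketch: rank temporal windows by saliency, keep the top k.

    fmri: tensor of shape (T, X, Y, Z), a 4D fMRI sequence.
    extractor: any callable mapping one window to a scalar saliency score
               (stand-in for the paper's lightweight temporal extractor).
    Returns the k most salient windows, shape (k, window_len, X, Y, Z).
    """
    T = fmri.shape[0]
    # Non-overlapping windows (an assumption made for simplicity).
    starts = list(range(0, T - window_len + 1, window_len))
    windows = torch.stack([fmri[s:s + window_len] for s in starts])
    with torch.no_grad():  # scoring is cheap and gradient-free here
        scores = torch.stack([extractor(w) for w in windows]).flatten()
    top = scores.topk(min(k, len(starts))).indices
    return windows[top]  # only these reach the expensive 4D encoder
```

Because only $k$ short windows (rather than the full sequence) enter the voxel-level encoder, the compute of stage 2 is decoupled from total scan length.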
Problem

Research questions and friction points this paper is trying to address.

Addresses data- and training-efficiency bottlenecks in fMRI foundation models
Resolves atlas-based loss of spatial details and atlas-free computational intensity
Enables efficient voxel-level analysis with reduced memory and pre-training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight temporal extractor selects salient data windows
4D hierarchical encoder learns from top-k windows efficiently
Discards about 70% of masked patches to reduce memory usage
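The memory saving from the last bullet comes from physically dropping masked patch tokens rather than zeroing them out, so the encoder attends over only ~30% of the tokens (an MAE-style trick). The sketch below illustrates the idea under that assumption; the function name and random-mask strategy are illustrative, not taken from the paper.

```python
import torch

def drop_masked_patches(patch_tokens, mask_ratio=0.7):
    """Keep only the visible ~(1 - mask_ratio) fraction of patch tokens.

    patch_tokens: (batch, num_patches, dim) embedded 4D patches.
    Returns the visible tokens (batch, num_visible, dim) and the kept
    indices, so masked positions can be restored for the decoder/target.
    """
    B, N, D = patch_tokens.shape
    num_visible = max(1, int(N * (1 - mask_ratio)))
    noise = torch.rand(B, N)                    # one random score per patch
    keep = noise.argsort(dim=1)[:, :num_visible]  # random visible subset
    visible = torch.gather(
        patch_tokens, 1, keep.unsqueeze(-1).expand(-1, -1, D))
    return visible, keep
```

Since self-attention cost grows quadratically with token count, encoding 30% of the patches cuts attention FLOPs to roughly 9% and activation memory roughly in proportion to the kept fraction.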
Authors

Mo Wang (University of Florida; University of Maryland; Portland State University): retirement; work; occupational health psychology; expatriate management; research methods
Junfeng Xia (Anhui University): bioinformatics
Wenhao Ye (Department of Biomedical Engineering, Southern University of Science and Technology, China)
Enyu Liu (Department of Biomedical Engineering, Southern University of Science and Technology, China)
Kaining Peng (Department of Biomedical Engineering, Southern University of Science and Technology, China)
Jianfeng Feng (Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, China)
Quanying Liu (Department of Biomedical Engineering, Southern University of Science and Technology, China)
Hongkai Wen (University of Warwick): Machine Learning; ML/AI Systems; Cyber-Physical Systems