🤖 AI Summary
This work addresses three properties of fMRI datasets that often lead to model overfitting: small sample sizes, low-quality labels, and high dimensionality. The authors propose BrainSimSiam, a lightweight self-supervised representation learning framework. It combines a positive-pair-only Siamese network architecture with fMRI-specific data augmentation, feature disentanglement, and consistency constraints to learn task-agnostic, robust functional brain representations, without requiring large-scale pretraining or negative samples. Experimental results show that the learned representations outperform fully supervised baselines across multiple downstream classification and regression tasks and approach the performance of large-scale pretrained models, reducing reliance on computational resources.
📝 Abstract
Functional magnetic resonance imaging (fMRI) is a powerful tool for investigating human brain function. However, the high cost of data acquisition and the inherent subjectivity of psychiatric rating scales often lead to datasets with small sample sizes and variable label quality, especially when targeting a specific neurological condition. Combined with the inherently high dimensionality of fMRI data, these limitations substantially increase the risk of model overfitting. Recent years have seen growing interest in developing fMRI foundation models by combining multiple datasets; however, the computational resources needed for pretraining and fine-tuning are often prohibitive. We introduce BrainSimSiam, a data-efficient self-supervised representation learning framework that leverages positive-only data pairs to learn robust and generalizable features. We show that this lightweight framework yields representations that generalize across diverse downstream tasks, outperforming fully supervised baselines and approaching the performance of large-scale models. The learned representations achieve strong performance across multiple downstream classification and regression tasks, highlighting the potential of BrainSimSiam for data-limited neuroimaging applications.
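To make "positive-only data pairs" concrete, the sketch below shows one common way such an objective is written: a symmetrized negative cosine similarity between two augmented views of the same sample, in the style of SimSiam. The text above does not state the exact loss BrainSimSiam uses, so this is an assumption suggested by the framework's name, not the authors' implementation. The names `p1`, `z1`, `p2`, `z2` (predictor outputs and encoder projections for the two views) and `positive_pair_loss` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def positive_pair_loss(p1, z1, p2, z2):
    """Symmetrized negative cosine similarity for one positive pair.

    Assumed (SimSiam-style) setup, not confirmed by the source:
      z1, z2: encoder projections of two augmented views of one sample
      p1, p2: predictor outputs computed from z1 and z2

    In a real training loop z1 and z2 would be treated as constants
    (stop-gradient) inside this loss. Plain Python has no autograd,
    so only the forward value is computed here.
    """
    return -0.5 * (cosine_similarity(p1, z2) + cosine_similarity(p2, z1))


# Toy vectors: each prediction matches the other view's projection,
# so the loss reaches its minimum of -1.
loss = positive_pair_loss(p1=[1.0, 0.0], z1=[0.0, 2.0],
                          p2=[0.0, 1.0], z2=[3.0, 0.0])
print(loss)
```

No negative samples appear anywhere in the loss: each term only pulls one view's prediction toward the other view's projection. In SimSiam-style training the stop-gradient on the projections is what is generally credited with preventing collapse to a constant representation.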