🤖 AI Summary
This study investigates how the inductive biases of deep learning architectures affect small-sample fMRI time-series classification, using biological sex classification evaluated under both resting-state and movie-watching conditions. Within a unified experimental framework, we systematically compare CNNs, LSTMs, and Transformers for whole-brain and functional-network-level classification on the HCP 7T dataset. CNNs significantly outperform the temporal models under both conditions, which we attribute to their superior capacity for modeling local spatial patterns and inter-regional dependencies. Discriminative information is concentrated in the whole-brain signal and a small set of core functional networks, including the default mode and cingulo-opercular networks. To our knowledge, this is the first work to demonstrate the structural advantage of the convolutional inductive bias in small-sample brain functional pattern recognition. The findings provide an interpretable, empirically grounded rationale for architecture selection in neuroimaging model design.
📝 Abstract
Deep learning has advanced fMRI analysis, yet it remains unclear which architectural inductive biases best capture functional patterns in human brain activity. The question is especially important in small-sample settings, since most fMRI datasets are small. We compare models embodying three major deep learning inductive biases: convolutional neural networks (CNNs), long short-term memory networks (LSTMs), and Transformers, on the task of biological sex classification. All models are evaluated within a unified pipeline using parcellated multivariate fMRI time series from the Human Connectome Project (HCP) 7-Tesla cohort, which includes four resting-state runs and four movie-watching task runs. We assess performance on the whole brain, the subcortex, and 12 functional networks. CNNs consistently achieved the highest discrimination in both resting-state and movie-watching conditions, while LSTMs and Transformers underperformed. Network-resolved analyses identified the whole brain and the Default Mode, Cingulo-Opercular, Dorsal Attention, and Frontoparietal networks as most discriminative, with largely consistent results across resting-state and movie-watching. These findings suggest that, at this dataset size, discriminative information is carried by local spatial patterns and inter-regional dependencies, favoring the convolutional inductive bias. Our study offers practical guidance for selecting deep learning architectures for fMRI time-series classification.
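As a rough, self-contained illustration of the convolutional inductive bias the abstract credits for the CNN advantage, the sketch below applies a 1D convolutional filter across a parcellated fMRI frame: at each position the filter sees only a local window of neighboring regions, so its responses encode local spatial patterns rather than global ones. The region count, timepoint count, and window size here are illustrative placeholders, not the paper's actual parcellation or architecture.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Valid-mode 1D cross-correlation: slide `kernel` over `x`."""
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

# Illustrative shapes (NOT the study's parcellation): one run as
# a regions x timepoints matrix of parcel-averaged signals.
rng = np.random.default_rng(0)
n_regions, n_timepoints = 360, 900
run = rng.standard_normal((n_regions, n_timepoints))

# A convolutional filter of width 7 responds only to a local window of
# regions at each position -- the "local spatial pattern" bias.
kernel = rng.standard_normal(7)

# Filter responses for the first 3 timepoints, stacked as
# (positions, timepoints); positions = n_regions - 7 + 1 = 354.
features = np.stack([conv1d_valid(run[:, t], kernel) for t in range(3)], axis=1)
print(features.shape)  # (354, 3)
```

A recurrent or attention-based model, by contrast, has no built-in preference for such spatially local structure, which is one way to read the abstract's finding that CNNs win at this sample size.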