🤖 AI Summary
BrainFound is a self-supervised foundation model designed specifically for whole-brain 3D MRI, described as the first of its kind. It targets three long-standing limitations in 3D brain MRI modeling: reliance on single-slice paradigms, poor generalizability, and heavy dependence on expert annotations. Methodologically, the authors extend the DINO-v2 framework with multi-scale 3D patch embedding and voxel-level sequence modeling, enabling robust representation learning from single- or multi-modal inputs (e.g., T1, T2, FLAIR). In doing so, the work adapts Vision Transformers, originally built for 2D natural images, to anatomically grounded 3D medical imaging. Evaluated across multi-scanner, multi-center, and cross-protocol settings, BrainFound consistently outperforms both state-of-the-art self-supervised and supervised methods. It delivers gains in downstream clinical tasks, including disease detection and semantic segmentation, improving diagnostic accuracy while reducing the annotation burden. The model is positioned as practically useful for both clinical deployment and biomedical research.
📝 Abstract
Foundation models in artificial intelligence (AI) are transforming medical imaging by enabling general-purpose feature learning from large-scale, unlabeled datasets. In this work, we introduce BrainFound, a self-supervised foundation model for brain MRI, built by extending DINO-v2, a vision transformer originally designed for 2D natural images. BrainFound adapts DINO-v2 to model full 3D brain anatomy by incorporating volumetric information from sequential MRI slices, moving beyond conventional single-slice paradigms. It supports both single- and multimodal inputs, enabling a broad range of downstream tasks, including disease detection and image segmentation, while generalising across varied imaging protocols and clinical scenarios. We show that BrainFound consistently outperforms existing self-supervised pretraining strategies and supervised baselines, particularly in label-scarce and multi-contrast settings. By integrating information from diverse 3D MRI modalities (e.g., T1, T2, FLAIR), it enhances diagnostic accuracy and reduces dependency on extensive expert annotations. This flexibility makes BrainFound a scalable and practical solution for 3D neuroimaging pipelines, with significant potential for clinical deployment and research innovation.
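As a rough illustration of what moving from 2D to volumetric input means for a vision transformer, the sketch below splits a multi-contrast volume into non-overlapping 3D patches and projects each patch to a token. It is not BrainFound's implementation: the function names `patchify_3d` and `embed_patches`, the 16×16×16 patch size, the embedding width of 64, the random projection weights, and the choice to stack T1, T2, and FLAIR as channels are all assumptions made for this example, and only a single patch scale is shown.

```python
import numpy as np


def patchify_3d(volume, patch=(16, 16, 16)):
    """Split a (C, D, H, W) volume into flattened, non-overlapping 3D patches.

    Returns an array of shape (num_patches, C * pd * ph * pw).
    """
    c, d, h, w = volume.shape
    pd, ph, pw = patch
    if d % pd or h % ph or w % pw:
        raise ValueError("volume dimensions must be divisible by the patch size")
    gd, gh, gw = d // pd, h // ph, w // pw
    # Expose the patch grid and the within-patch axes as separate dimensions.
    x = volume.reshape(c, gd, pd, gh, ph, gw, pw)
    # Reorder to (gd, gh, gw, c, pd, ph, pw) so each grid cell holds one patch.
    x = x.transpose(1, 3, 5, 0, 2, 4, 6)
    return x.reshape(gd * gh * gw, c * pd * ph * pw)


def embed_patches(patches, weight, bias):
    """Linear projection of flattened patches to token embeddings."""
    return patches @ weight + bias


rng = np.random.default_rng(0)

# Hypothetical input: three co-registered contrasts (e.g. T1, T2, FLAIR)
# stacked as channels, with a small 32 x 64 x 64 voxel grid.
volume = rng.standard_normal((3, 32, 64, 64)).astype(np.float32)

patches = patchify_3d(volume, patch=(16, 16, 16))

# Assumed embedding width; the weights are random placeholders, not trained.
embed_dim = 64
weight = (rng.standard_normal((patches.shape[1], embed_dim)) * 0.02).astype(np.float32)
bias = np.zeros(embed_dim, dtype=np.float32)

tokens = embed_patches(patches, weight, bias)
print(patches.shape, tokens.shape)
```

With these assumed sizes the volume yields a 2×4×4 grid of patches, so the transformer would receive 32 tokens per scan instead of one token sequence per slice. A single-contrast input would use the same code with one channel.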