Towards Generalisable Foundation Models for 3D Brain MRI

📅 2025-10-27
🤖 AI Summary
Long-standing limitations in 3D brain MRI modeling—including reliance on single-slice paradigms, poor generalisability, and heavy dependence on expert annotations—are addressed by BrainFound, the first self-supervised foundation model designed specifically for whole-brain 3D MRI. Methodologically, the authors extend the DINO-v2 framework with multi-scale 3D patch embedding and voxel-level sequence modeling, enabling robust representation learning from single- or multi-modal inputs (e.g., T1, T2, FLAIR). Crucially, this work pioneers the effective adaptation of Vision Transformers to anatomically grounded 3D medical imaging. Evaluated across multi-scanner, multi-center, and cross-protocol settings, BrainFound consistently outperforms both state-of-the-art self-supervised and supervised methods. It delivers significant gains in downstream clinical tasks—including disease detection and semantic segmentation—improving diagnostic accuracy while drastically reducing annotation burden. The model demonstrates strong practical utility for both clinical deployment and biomedical research.

📝 Abstract
Foundation models in artificial intelligence (AI) are transforming medical imaging by enabling general-purpose feature learning from large-scale, unlabeled datasets. In this work, we introduce BrainFound, a self-supervised foundation model for brain MRI, built by extending DINO-v2, a vision transformer originally designed for 2D natural images. BrainFound adapts DINO-v2 to model full 3D brain anatomy by incorporating volumetric information from sequential MRI slices, moving beyond conventional single-slice paradigms. It supports both single- and multimodal inputs, enabling a broad range of downstream tasks, including disease detection and image segmentation, while generalising across varied imaging protocols and clinical scenarios. We show that BrainFound consistently outperforms existing self-supervised pretraining strategies and supervised baselines, particularly in label-scarce and multi-contrast settings. By integrating information from diverse 3D MRI modalities (e.g., T1, T2, FLAIR), it enhances diagnostic accuracy and reduces dependency on extensive expert annotations. This flexibility makes BrainFound a scalable and practical solution for 3D neuroimaging pipelines, with significant potential for clinical deployment and research innovation.
Problem

Research questions and friction points this paper is trying to address.

Adapting 2D vision transformers to 3D brain MRI anatomy
Enabling generalisable medical imaging across protocols and tasks
Reducing dependency on expert annotations for diagnosis
Innovation

Methods, ideas, or system contributions that make the work stand out.

Extends DINO-v2 vision transformer for 3D brain MRI
Incorporates volumetric data from sequential MRI slices
Supports single- and multimodal inputs for diverse tasks
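The summary does not include implementation details, but the core change behind extending a 2D vision transformer like DINO-v2 to volumetric MRI is replacing square image patches with cubic voxel patches, so each token embeds a small 3D block rather than a 2D tile. The shape arithmetic can be sketched as below; the function name and the 128³ volume / 16³ patch sizes are illustrative assumptions, not values reported for BrainFound.

```python
def patchify_3d_shape(vol_shape, patch, n_modalities=1):
    """Token count and per-token input dimension for a 3D ViT patch embedding.

    vol_shape    : (depth, height, width) of the MRI volume in voxels
    patch        : cubic patch edge length (assumed to divide each axis)
    n_modalities : contrasts stacked as channels (e.g. T1, T2, FLAIR -> 3)
    """
    d, h, w = vol_shape
    assert d % patch == 0 and h % patch == 0 and w % patch == 0
    n_tokens = (d // patch) * (h // patch) * (w // patch)  # one token per cube
    token_dim = patch ** 3 * n_modalities                  # voxels per token
    return n_tokens, token_dim

# 128^3 volume, 16^3 patches, three stacked contrasts (T1, T2, FLAIR)
print(patchify_3d_shape((128, 128, 128), 16, n_modalities=3))  # -> (512, 12288)
```

The cubic patch size trades sequence length against per-token dimensionality: halving the patch edge multiplies the token count by 8, which is why 3D adaptations are far more memory-sensitive than their 2D counterparts.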