A generalizable 3D framework and model for self-supervised learning in medical imaging

📅 2025-01-20
🤖 AI Summary
Existing self-supervised methods for 3D medical imaging rely on simple pretext formulations and on modality- or organ-specific datasets, which limits their generalizability and scalability. To address this, we propose 3DINO, a general-purpose self-supervised pretraining framework for 3D medical imaging, and 3DINO-ViT, a model pretrained on a large, heterogeneous dataset of roughly 100,000 cross-modal (CT/MRI) 3D scans covering more than 10 organs. We adapt the DINO paradigm to 3D medical data by designing a dedicated 3D Vision Transformer architecture, introducing unified multimodal and multi-organ preprocessing and augmentation strategies, and optimizing for feature consistency across augmented views. Across segmentation and classification tasks, 3DINO-ViT outperforms state-of-the-art methods on the majority of evaluation metrics and labeled dataset sizes, with particularly pronounced gains in low-data and out-of-distribution settings. The 3DINO framework and the 3DINO-ViT model will be made available to support research on 3D foundation models and their downstream adaptation.
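A Vision Transformer that operates on volumes needs each scan split into 3D tokens rather than 2D image patches. The sketch below shows one common way to do that: cutting a volume into non-overlapping cubic patches and flattening each into a vector. It is only an illustration of the general idea. The function name `patchify_3d`, the 96×96×96 input size, and the patch size of 16 are assumptions chosen for the example and are not taken from the paper.

```python
import numpy as np


def patchify_3d(volume, patch=16):
    """Split a (D, H, W) volume into flattened, non-overlapping cubic patches.

    Hypothetical helper for illustration; not the 3DINO implementation.
    """
    d, h, w = volume.shape
    if d % patch or h % patch or w % patch:
        raise ValueError("each dimension must be divisible by the patch size")
    gd, gh, gw = d // patch, h // patch, w // patch
    # Expose the patch grid and the within-patch offsets as separate axes.
    blocks = volume.reshape(gd, patch, gh, patch, gw, patch)
    # Move the three grid axes to the front, followed by the three offset axes.
    blocks = blocks.transpose(0, 2, 4, 1, 3, 5)
    # One row per patch, one column per voxel inside that patch.
    return blocks.reshape(gd * gh * gw, patch ** 3)


# Assumed input: a single-channel 96^3 crop filled with random values.
volume = np.random.default_rng(0).normal(size=(96, 96, 96)).astype(np.float32)
tokens = patchify_3d(volume, patch=16)
print(tokens.shape)  # (216, 4096): a 6 x 6 x 6 grid of 16^3-voxel patches
```

In a real model each flattened patch would typically pass through a learned linear projection (or an equivalent strided 3D convolution) before entering the transformer. Note that the token count grows cubically with crop size, which is one reason volumetric pretraining is more memory-intensive than its 2D counterpart.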

📝 Abstract
Current self-supervised learning methods for 3D medical imaging rely on simple pretext formulations and organ- or modality-specific datasets, limiting their generalizability and scalability. We present 3DINO, a cutting-edge SSL method adapted to 3D datasets, and use it to pretrain 3DINO-ViT: a general-purpose medical imaging model, on an exceptionally large, multimodal, and multi-organ dataset of ~100,000 3D medical imaging scans from over 10 organs. We validate 3DINO-ViT using extensive experiments on numerous medical imaging segmentation and classification tasks. Our results demonstrate that 3DINO-ViT generalizes across modalities and organs, including out-of-distribution tasks and datasets, outperforming state-of-the-art methods on the majority of evaluation metrics and labeled dataset sizes. Our 3DINO framework and 3DINO-ViT will be made available to enable research on 3D foundation models or further finetuning for a wide range of medical imaging applications.
Problem

Research questions and friction points this paper is trying to address.

3D medical image
self-supervised learning
multi-scenario application
Innovation

Methods, ideas, or system contributions that make the work stand out.

3DINO
self-supervised learning
3D medical images
Tony Xu
University of Toronto
Computer Vision, Medical Imaging, Deep Learning
Sepehr Hosseini
Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Vector Institute, Toronto, Ontario, Canada; Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
Chris Anderson
Institute for Aerospace Studies, University of Toronto, Toronto, Ontario, Canada
Anthony Rinaldi
Physical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada
Rahul G. Krishnan
University of Toronto
Machine Learning, Artificial Intelligence, Healthcare, Probabilistic Models, Causal Inference
Anne L. Martel
Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada; Physical Sciences Platform, Sunnybrook Research Institute, Toronto, Ontario, Canada
Maged Goubran
Canada Research Chair in AI and Computational Neuroscience, University of Toronto
Computational Neuroscience, Artificial Intelligence, Neuroimaging, Neuromodulation, Connectomics