🤖 AI Summary
This work addresses the challenge of poor diagnostic consistency in musculoskeletal MRI, which arises from anatomical complexity, coexisting pathologies, and scarce annotations. To this end, we propose the first unified diffusion-based foundation model for musculoskeletal imaging. Pretraining is self-supervised: separate 3D diffusion models are trained on the sagittal, coronal, and axial views, and their view-specific representations are then fused to support both segmentation and multi-label diagnosis. The approach is the first to apply diffusion models to multi-task modeling in musculoskeletal MRI, enabling cross-view anatomical feature integration, high-accuracy diagnosis under limited labeled data, and cross-joint transferability. The model achieves state-of-the-art performance in segmenting 11 knee structures and detecting 8 abnormalities, maintains high accuracy with only 10% of the labeled data, and generalizes to diagnosing 11 conditions in the ankle and shoulder joints.
📝 Abstract
Musculoskeletal disorders represent a significant global health burden and are a leading cause of disability worldwide. While MRI is essential for accurate diagnosis, its interpretation remains exceptionally challenging. Radiologists must identify multiple potential abnormalities within complex anatomical structures across different imaging planes, a process that requires significant expertise and is prone to variability. We developed OrthoDiffusion, a unified diffusion-based foundation model designed for multi-task musculoskeletal MRI interpretation. The framework utilizes three orientation-specific 3D diffusion models, pre-trained in a self-supervised manner on 15,948 unlabeled knee MRI scans, to learn robust anatomical features from sagittal, coronal, and axial views. These view-specific representations are integrated to support diverse clinical tasks, including anatomical segmentation and multi-label diagnosis. Our evaluation demonstrates that OrthoDiffusion achieves excellent performance in the segmentation of 11 knee structures and the detection of 8 knee abnormalities. The model exhibited remarkable robustness across different clinical centers and MRI field strengths, consistently outperforming traditional supervised models. Notably, in settings where labeled data was scarce, OrthoDiffusion maintained high diagnostic precision using only 10% of training labels. Furthermore, the anatomical representations learned from knee imaging proved highly transferable to other joints, achieving strong diagnostic performance across 11 diseases of the ankle and shoulder. These findings suggest that diffusion-based foundation models can serve as a unified platform for multi-disease diagnosis and anatomical segmentation, potentially improving the efficiency and accuracy of musculoskeletal MRI interpretation in real-world clinical workflows.
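The abstract does not specify how the view-specific representations are integrated, so the following is only an illustrative sketch of one common option: concatenating per-view feature vectors and passing them to a multi-label head with one independent sigmoid per abnormality. The function names (`fuse_views`, `multilabel_head`, `init_head`), the feature dimension, and the concatenation strategy are all assumptions made for illustration, not details taken from OrthoDiffusion.

```python
import math
import random

# View order is fixed so the fused vector always has the same layout.
VIEWS = ("sagittal", "coronal", "axial")
NUM_LABELS = 8  # the abstract reports 8 knee abnormalities


def fuse_views(features):
    """Concatenate per-view feature vectors (assumed fusion strategy)."""
    fused = []
    for view in VIEWS:
        fused.extend(features[view])
    return fused


def init_head(input_dim, num_labels, seed=0):
    """Create small random weights and zero biases for a linear head."""
    rng = random.Random(seed)
    weights = [
        [rng.uniform(-0.1, 0.1) for _ in range(input_dim)]
        for _ in range(num_labels)
    ]
    biases = [0.0] * num_labels
    return weights, biases


def multilabel_head(fused, weights, biases):
    """Linear layer followed by an independent sigmoid per label.

    Sigmoids (rather than a softmax) let several abnormalities be
    predicted as present in the same scan.
    """
    probs = []
    for row, bias in zip(weights, biases):
        logit = sum(w * x for w, x in zip(row, fused)) + bias
        probs.append(1.0 / (1.0 + math.exp(-logit)))
    return probs


if __name__ == "__main__":
    feat_dim = 4  # hypothetical per-view feature size
    rng = random.Random(1)
    # Stand-ins for features from the three pretrained view-specific encoders.
    features = {v: [rng.gauss(0.0, 1.0) for _ in range(feat_dim)] for v in VIEWS}
    fused = fuse_views(features)
    weights, biases = init_head(len(fused), NUM_LABELS)
    probs = multilabel_head(fused, weights, biases)
    print(len(fused), len(probs))
```

In a label-scarce setting such as the 10% regime described above, a lightweight head like this could in principle be trained on top of frozen pretrained features, although the abstract does not say whether the encoders are frozen or fine-tuned.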