Self-Supervised Learning for Knee Osteoarthritis: Diagnostic Limitations and Prognostic Value of Uncurated Hospital Data

📅 2026-03-25

🤖 AI Summary

This study evaluates self-supervised learning (SSL) for knee osteoarthritis diagnosis and prognosis, with particular attention to how uncurated hospital imaging data, which is heavily skewed toward severe disease (an estimated 93% KL grade 3), affects the two tasks differently. Two pretraining strategies are compared against ImageNet-pretrained initialization: image-only SSL on knee radiographs from the OAI, MOST, and NYU cohorts, and multimodal image-text SSL on hospital radiographs paired with radiologist impressions. Models are evaluated with linear probing and full fine-tuning. For KL grading, image-only SSL helps under linear probing but does not surpass ImageNet pretraining under full fine-tuning, and multimodal SSL does not improve grading. For prognosis, the multimodal initialization outperforms ImageNet baselines in predicting 4-year structural incidence and progression, including on external validation, where AUROC on the MOST cohort is 0.701 versus 0.599 with 10% labeled data. The results suggest that diagnostic and prognostic tasks place different demands on the distribution of the pretraining data.

📝 Abstract

This study assesses whether self-supervised learning (SSL) improves knee osteoarthritis (OA) modeling for diagnosis and prognosis relative to ImageNet-pretrained initialization. We compared (i) image-only SSL pretrained on knee radiographs from the OAI, MOST, and NYU cohorts, and (ii) multimodal image-text SSL pretrained on uncurated hospital knee radiographs paired with radiologist impressions. For diagnostic Kellgren-Lawrence (KL) grade prediction, SSL offered mixed results. While image-only SSL improved accuracy during linear probing (frozen encoder), it did not outperform ImageNet pretraining during full fine-tuning. Similarly, multimodal SSL failed to improve grading performance. We attribute this to severe bias in the uncurated hospital pretraining corpus (93% estimated KL grade 3), which limited alignment with the balanced diagnostic task. In contrast, this same multimodal initialization significantly improved prognostic modeling. It outperformed ImageNet baselines in predicting 4-year structural incidence and progression, including on external validation (MOST AUROC: 0.701 vs. 0.599 at 10% labeled data). Overall, while uncurated hospital image-text data may be ineffective for learning diagnosis due to severity bias, it provides a strong signal for prognostic modeling when the downstream task aligns with the pretraining data distribution.
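The prognostic results above are reported as AUROC. As a reading aid, the following is a minimal sketch of how that metric can be computed from predicted risk scores, using the pairwise (Mann-Whitney) definition. The function name `auroc` and the toy labels and scores are illustrative assumptions; they are not the study's code or data.

```python
def auroc(labels, scores):
    """Pairwise AUROC: the fraction of (positive, negative) pairs in which
    the positive case receives the higher score. Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example (NOT data from the study):
# label 1 = knee progressed within the follow-up window, 0 = did not.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]

if __name__ == "__main__":
    print(f"toy AUROC = {auroc(toy_labels, toy_scores):.3f}")
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.599 and 0.701 should be read. The quadratic loop is fine for illustration; a rank-based implementation would be preferable for large cohorts.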

Problem

Research questions and friction points this paper is trying to address.

knee osteoarthritis

self-supervised learning

diagnostic bias

prognostic modeling

uncurated hospital data

Innovation

Methods, ideas, or system contributions that make the work stand out.

self-supervised learning

knee osteoarthritis

prognostic modeling