The role of self-supervised pretraining in differentially private medical image analysis

📅 2026-01-27
🤖 AI Summary
This study addresses the substantial loss of diagnostic performance that differential privacy (DP) typically causes in medical imaging. It presents a large-scale evaluation, under full-model DP, of how three initialization strategies affect chest X-ray classification: general supervised (ImageNet), general self-supervised (DINOv3), and domain-specific supervised (MIMIC-CXR) pretraining. ConvNeXt models are trained with DP-SGD on a benchmark of more than 800,000 chest radiographs and evaluated on five external datasets. DINOv3 initialization consistently outperforms ImageNet initialization under DP, while domain-specific supervised pretraining performs best and comes closest to the non-private baseline. The choice of initialization also strongly influences cross-institutional generalization, demographic fairness, and robustness to data scale and model capacity, which makes it a central factor in balancing privacy and utility.

📝 Abstract
Differential privacy (DP) provides formal protection for sensitive data but typically incurs substantial losses in diagnostic performance. Model initialization has emerged as a critical factor in mitigating this degradation, yet the role of modern self-supervised learning under full-model DP remains poorly understood. Here, we present a large-scale evaluation of initialization strategies for differentially private medical image analysis, using chest radiograph classification as a representative benchmark with more than 800,000 images. Using state-of-the-art ConvNeXt models trained with DP-SGD across realistic privacy regimes, we compare non-domain-specific supervised ImageNet initialization, non-domain-specific self-supervised DINOv3 initialization, and domain-specific supervised pretraining on MIMIC-CXR, the largest publicly available chest radiograph dataset. Evaluations are conducted across five external datasets spanning diverse institutions and acquisition settings. We show that DINOv3 initialization consistently improves diagnostic utility relative to ImageNet initialization under DP, but remains inferior to domain-specific supervised pretraining, which achieves performance closest to non-private baselines. We further demonstrate that initialization choice strongly influences demographic fairness, cross-dataset generalization, and robustness to data scale and model capacity under privacy constraints. The results establish initialization strategy as a central determinant of utility, fairness, and generalization in differentially private medical imaging.
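For readers unfamiliar with DP-SGD, the update it performs can be sketched in a few lines. The following is a minimal illustration in plain Python, not the authors' training code: the function names, the parameter names (`max_norm`, `noise_multiplier`), and the toy gradients are assumptions made for this example, and privacy accounting (tracking the resulting privacy budget) is omitted entirely.

```python
import math
import random


def clip_gradient(grad, max_norm):
    """Scale one per-example gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0.0 else 1.0
    return [g * scale for g in grad]


def dp_sgd_step(weights, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One DP-SGD update: clip each example's gradient, sum, add noise, average."""
    dim = len(weights)
    summed = [0.0] * dim
    for grad in per_example_grads:
        clipped = clip_gradient(grad, max_norm)
        for i in range(dim):
            summed[i] += clipped[i]
    # Gaussian noise with standard deviation noise_multiplier * max_norm
    # is added to each coordinate of the summed, clipped gradients.
    sigma = noise_multiplier * max_norm
    batch_size = len(per_example_grads)
    noisy_mean = [(s + rng.gauss(0.0, sigma)) / batch_size for s in summed]
    return [w - lr * g for w, g in zip(weights, noisy_mean)]


# Toy usage with hypothetical values: two examples, two parameters.
rng = random.Random(0)
weights = [0.0, 0.0]
grads = [[3.0, 4.0], [0.3, 0.4]]
new_weights = dp_sgd_step(
    weights, grads, lr=0.1, max_norm=1.0, noise_multiplier=1.0, rng=rng
)
```

Clipping bounds how much any single image can move the model, and the added noise masks what remains of that influence. Both steps distort the gradient signal, which is one way to see why a good starting point matters more under DP than in non-private training.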
Problem

Research questions and friction points this paper is trying to address.

differential privacy
medical image analysis
self-supervised pretraining
model initialization
diagnostic performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

differential privacy
self-supervised learning
medical image analysis
model initialization
fairness
Authors

Soroosh Tayebi Arasteh
RWTH Aachen University
Deep Learning, AI in Medicine, Generative AI, Medical Image Analysis
Mina Farajiamiri
Lab for AI in Medicine, RWTH Aachen University, Aachen, Germany; School of Business and Economics, RWTH Aachen University, Aachen, Germany
Mahshad Lotfinia
RWTH Aachen University
Artificial Intelligence, Deep Learning, Medical Image Analysis
B. Hinrichs-Puladi
Department of Oral and Maxillofacial Surgery, University Hospital RWTH Aachen, Aachen, Germany; Institute of Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany
Jonas Bienzeisler
Institute of Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany
Mohamed Alhaskir
Institute of Medical Informatics, University Hospital RWTH Aachen, Aachen, Germany
Mirabela Rusu
Assistant Professor of Radiology at Stanford University
multi-protocol, multi-scale data fusion, MRI, Histology, computational imaging
Christiane Kuhl
Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
S. Nebelung
Lab for AI in Medicine, RWTH Aachen University, Aachen, Germany; Department of Diagnostic and Interventional Radiology, University Hospital RWTH Aachen, Aachen, Germany
Daniel Truhn
Professor of Radiology, University Hospital Aachen
Machine Learning, Artificial Intelligence, Computer Vision, Medical Imaging