Free Lunch in Medical Image Foundation Model Pre-training via Randomized Synthesis and Disentanglement

📅 2026-02-12
📈 Citations: 0
Influential: 0

📝 Abstract
Medical image foundation models (MIFMs) have demonstrated remarkable potential for a wide range of clinical tasks, yet their development is constrained by the scarcity, heterogeneity, and high cost of large-scale annotated datasets. Here, we propose RaSD (Randomized Synthesis and Disentanglement), a scalable framework for pre-training MIFMs entirely on synthetic data. By modeling anatomical structures and appearance variations with randomized Gaussian distributions, RaSD exposes models to sufficient multi-scale structural and appearance perturbations, forcing them to rely on invariant, task-relevant anatomical cues rather than dataset-specific textures, and thereby enabling robust and transferable representation learning. We pre-trained RaSD on 1.2 million 3D volumes and 9.6 million 2D images, and extensively evaluated the resulting models across 6 imaging modalities, 48 datasets, and 56 downstream tasks. Across all evaluated downstream tasks, RaSD consistently outperforms models trained from scratch, achieves the best performance on 17 tasks, and remains comparable to models pre-trained on large real datasets on most of the others. These results demonstrate the capacity of synthetic data alone to drive robust representation learning. Our findings establish a paradigm shift in medical AI, demonstrating that synthetic data can serve as a "free lunch" for scalable, privacy-preserving, and clinically generalizable foundation models.
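The abstract does not spell out the synthesis procedure, so the following is only a minimal sketch of the general idea of randomized Gaussian synthesis, not the authors' implementation. It assumes a 2D toy setting in which structure comes from randomly placed Gaussian blobs of random scale, and appearance comes from a randomly drawn Gaussian intensity model per structure. All function names, parameter ranges, and the background threshold are hypothetical choices made for illustration.

```python
import math
import random


def synthesize_label_map(size, num_structures, rng):
    """Assign each pixel to the random Gaussian blob with the strongest response.

    Label 0 is background. Blob centers and scales are random, so each call
    with a fresh RNG state yields a different multi-scale layout.
    """
    blobs = []
    for _ in range(num_structures):
        cx = rng.uniform(0, size - 1)
        cy = rng.uniform(0, size - 1)
        sigma = rng.uniform(size / 16, size / 4)  # hypothetical scale range
        blobs.append((cx, cy, sigma))

    labels = [[0] * size for _ in range(size)]
    for y in range(size):
        for x in range(size):
            best_label, best_val = 0, 0.3  # 0.3: arbitrary background threshold
            for k, (cx, cy, sigma) in enumerate(blobs, start=1):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                val = math.exp(-d2 / (2 * sigma ** 2))
                if val > best_val:
                    best_label, best_val = k, val
            labels[y][x] = best_label
    return labels


def render_appearance(labels, num_structures, rng):
    """Sample pixel intensities from a random Gaussian model per label."""
    means = [rng.uniform(0.0, 1.0) for _ in range(num_structures + 1)]
    stds = [rng.uniform(0.01, 0.1) for _ in range(num_structures + 1)]
    return [
        [min(1.0, max(0.0, rng.gauss(means[lab], stds[lab]))) for lab in row]
        for row in labels
    ]


def synthesize_pair(size=32, num_structures=4, seed=0):
    """Return one label map and two renderings that share it.

    The two views have identical structure but independently drawn
    appearance, which is the kind of pairing a model could use to learn
    appearance-invariant features (an assumption, not a claim about RaSD).
    """
    rng = random.Random(seed)
    labels = synthesize_label_map(size, num_structures, rng)
    view_a = render_appearance(labels, num_structures, rng)
    view_b = render_appearance(labels, num_structures, rng)
    return labels, view_a, view_b
```

Calling `synthesize_pair(size=32, num_structures=4, seed=0)` returns a label map and two images as nested lists. Because both views are rendered from the same label map, any difference between them is due to appearance alone, which illustrates why texture becomes an unreliable cue under this kind of randomization.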
Problem

Research questions and friction points this paper is trying to address.

Medical image foundation models
data scarcity
annotated datasets
heterogeneity
high cost
Innovation

Methods, ideas, or system contributions that make the work stand out.

synthetic data
foundation model
randomized synthesis
disentanglement
medical image pre-training