🤖 AI Summary
Medical AI faces the dual challenges of data scarcity and privacy preservation, and existing generative models generalize poorly across medical specialties and imaging modalities. To address this, we propose a general-purpose text-guided medical image synthesis framework supporting six clinical specialties and ten image types. Our method employs a latent-space text-conditional diffusion model trained jointly on heterogeneous, multi-source medical imaging datasets. We further introduce a clinically grounded prompt engineering strategy and a dedicated evaluation protocol emphasizing semantic alignment with radiological and pathological concepts. Expert clinicians validate that the synthesized images align with their corresponding text prompts. A direct comparison against real images indicates that the synthesized images are novel, which may help preserve patient privacy. In downstream classification tasks, models trained on a mixture of synthetic and real data achieve performance comparable to baselines trained on twice the volume of real data. This work advances the generality and clinical credibility of medical image generation.
📝 Abstract
Deep learning algorithms require extensive data to achieve robust performance. However, data availability is often restricted in the medical domain due to patient privacy concerns. Synthetic data presents a possible solution to these challenges. Recently, image generative models have found increasing use in medical applications, but they are often designed for a single medical specialty and imaging modality, which limits their broader utility. To address this, we introduce MediSyn: a text-guided latent diffusion model capable of generating synthetic images from 6 medical specialties and 10 image types. The synthetic images are validated by expert clinicians for alignment with their corresponding text prompts. Furthermore, a direct comparison of the synthetic images against the real images indicates that our model synthesizes novel images and, crucially, may therefore preserve patient privacy. Finally, classifiers trained on a mixture of synthetic and real data achieve performance similar to those trained on twice the amount of real data. Our findings highlight the potential for generalist image generative models to accelerate algorithmic research and development in medicine.