AURAD: Anatomy-Pathology Unified Radiology Synthesis with Progressive Representations

📅 2025-09-05
🤖 AI Summary
Chest X-ray synthesis is difficult because lesion morphology is heterogeneous, pathology is tightly coupled to anatomy, expert annotations are scarce, and datasets differ in distribution; together these factors hinder fine-grained, controllable generation. AURAD addresses this with an anatomy-guided progressive pipeline: it first generates pseudo semantic pathology masks from clinical prompts conditioned on anatomical structures, then synthesizes high-fidelity X-rays guided by those masks. Pretrained expert medical models filter the outputs for clinical plausibility. In blinded evaluation, board-certified radiologists classified 78% of the synthesized images as authentic and rated over 40% of predicted segmentation overlays as clinically useful. The synthesized masks also serve as labels for downstream detection and segmentation in data-scarce settings.

📝 Abstract
Medical image synthesis has become an essential strategy for augmenting datasets and improving model generalization in data-scarce clinical settings. However, fine-grained and controllable synthesis remains difficult due to limited high-quality annotations and domain shifts across datasets. Existing methods, often designed for natural images or well-defined tumors, struggle to generalize to chest radiographs, where disease patterns are morphologically diverse and tightly intertwined with anatomical structures. To address these challenges, we propose AURAD, a controllable radiology synthesis framework that jointly generates high-fidelity chest X-rays and pseudo semantic masks. Unlike prior approaches that rely on randomly sampled masks, which limits diversity, controllability, and clinical relevance, our method learns to generate masks that capture multi-pathology coexistence and anatomical-pathological consistency. It follows a progressive pipeline: pseudo masks are first generated from clinical prompts conditioned on anatomical structures, and then used to guide image synthesis. We also leverage pretrained expert medical models to filter outputs and ensure clinical plausibility. Beyond visual realism, the synthesized masks also serve as labels for downstream tasks such as detection and segmentation, bridging the gap between generative modeling and real-world clinical applications. Extensive experiments and blinded radiologist evaluations demonstrate the effectiveness and generalizability of our method across tasks and datasets. In particular, 78% of our synthesized images are classified as authentic by board-certified radiologists, and over 40% of predicted segmentation overlays are rated as clinically useful. All code, pre-trained models, and the synthesized dataset will be released upon publication.
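
To make the idea of masks serving as downstream labels concrete, here is a minimal sketch of how a pseudo semantic mask could be turned into detection labels. It is not the authors' code: the mask encoding (0 for background, a positive integer per pathology class), the helper name `mask_to_boxes`, and the toy mask are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed encoding: 0 = background, k > 0 = pathology class id.
Mask = List[List[int]]
# Box as (x_min, y_min, x_max, y_max) in inclusive pixel indices.
Box = Tuple[int, int, int, int]


def mask_to_boxes(mask: Mask) -> Dict[int, Box]:
    """Return one tight bounding box per non-zero class id (hypothetical helper)."""
    boxes: Dict[int, Box] = {}
    for y, row in enumerate(mask):
        for x, label in enumerate(row):
            if label == 0:
                continue  # background pixel, nothing to record
            if label not in boxes:
                boxes[label] = (x, y, x, y)
            else:
                x0, y0, x1, y1 = boxes[label]
                boxes[label] = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
    return boxes


# Toy mask with two coexisting pathology classes (made-up data).
toy_mask: Mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 0, 2],
]
boxes = mask_to_boxes(toy_mask)
print(boxes)
```

The same mask can be used directly as a segmentation target, so one synthesized image-mask pair can supply labels for both tasks.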

Problem

Research questions and friction points this paper is trying to address.

Synthesizing fine-grained controllable chest X-rays with anatomical-pathological consistency
Addressing limited high-quality annotations and domain shifts in medical imaging
Generating clinically relevant radiology images with multi-pathology coexistence patterns

Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates X-rays and masks from clinical prompts
Uses progressive pipeline for anatomical-pathological consistency
Leverages pretrained medical models for clinical plausibility
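
The three points above describe a staged flow: prompt and anatomy to mask, mask to image, then expert filtering. The sketch below only mirrors that control flow with toy stand-ins and is not AURAD's implementation. Every name in it (`ANATOMY`, `Sample`, `generate_mask`, `synthesize_image`, the two expert checks, `run_pipeline`) is hypothetical, and the "generators" are simple rule-based placeholders where the paper uses learned models.

```python
import random
from dataclasses import dataclass
from typing import Callable, List

Grid = List[List[int]]

# Toy anatomy map (hypothetical): 1 marks lung-field pixels, 0 everything else.
ANATOMY: Grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0, 0],
]


@dataclass
class Sample:
    prompt: str
    mask: Grid
    image: List[List[float]]


def generate_mask(prompt: str, anatomy: Grid, rng: random.Random) -> Grid:
    """Stage 1 stand-in: a 3x3 patch centred in the lung field, clipped to anatomy.
    The prompt is carried along but ignored by this toy generator."""
    h, w = len(anatomy), len(anatomy[0])
    lung_pixels = [(y, x) for y in range(h) for x in range(w) if anatomy[y][x]]
    cy, cx = rng.choice(lung_pixels)
    mask = [[0] * w for _ in range(h)]
    for y in range(max(0, cy - 1), min(h, cy + 2)):
        for x in range(max(0, cx - 1), min(w, cx + 2)):
            mask[y][x] = anatomy[y][x]  # keep the lesion inside the anatomy
    return mask


def synthesize_image(mask: Grid, anatomy: Grid, rng: random.Random) -> List[List[float]]:
    """Stage 2 stand-in: intensities that are brighter where the mask is set."""
    return [
        [0.2 * a + 0.6 * m + 0.1 * rng.random() for a, m in zip(a_row, m_row)]
        for a_row, m_row in zip(anatomy, mask)
    ]


Expert = Callable[[Sample], bool]


def lesion_inside_anatomy(sample: Sample) -> bool:
    """Toy expert: every lesion pixel must lie on an anatomy pixel."""
    return all(
        a for a_row, m_row in zip(ANATOMY, sample.mask)
        for a, m in zip(a_row, m_row) if m
    )


def lesion_large_enough(sample: Sample) -> bool:
    """Toy expert: reject lesions smaller than 5 pixels."""
    return sum(map(sum, sample.mask)) >= 5


def run_pipeline(prompt: str, anatomy: Grid, experts: List[Expert],
                 n: int, seed: int = 0) -> List[Sample]:
    """Generate n candidates and keep those that every expert accepts."""
    rng = random.Random(seed)
    kept: List[Sample] = []
    for _ in range(n):
        mask = generate_mask(prompt, anatomy, rng)
        image = synthesize_image(mask, anatomy, rng)
        sample = Sample(prompt, mask, image)
        if all(expert(sample) for expert in experts):
            kept.append(sample)
    return kept


experts = [lesion_inside_anatomy, lesion_large_enough]
samples = run_pipeline("right lower lobe opacity", ANATOMY, experts, n=20)
print(f"kept {len(samples)} of 20 candidates")
```

Keeping the experts as a plain list of predicates makes it easy to add or drop a check without touching the generation stages, which is one plausible way to organize a multi-expert filter.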

Authors

Shuhan Ding, Duke-NUS Medical School
Jingjing Fu, MS (image/video processing)
Yu Gu, Microsoft
Naiteek Sangani, Microsoft (Generative AI)
Mu Wei, Microsoft
Paul Vozila, Microsoft
Nan Liu, Duke-NUS Medical School
Jiang Bian, Microsoft
Hoifung Poon, General Manager, Microsoft Health Futures (precision health, real-world evidence, large language models, multimodal GenAI)