🤖 AI Summary
This work addresses the fundamental trade-off among inference efficiency, individual fidelity, and distributional plausibility in high-dimensional 3D medical image generation. The authors propose the GDM framework, which extends generative drift mechanisms to 3D medical imaging for the first time. By leveraging an attraction–repulsion drift field, GDM jointly optimizes distributional realism and patient-specific fidelity within a single-step inference process. Built upon a medical foundation encoder, the method constructs a multi-level feature bank and introduces a gradient coordination strategy in a shared output space to enable synergistic modeling of global, local, and spatial representations. Evaluated on MRI-to-CT synthesis and sparse-view CT reconstruction tasks, GDM consistently outperforms existing generative models (GAN-based, flow-matching-based, and SDE-based approaches) as well as supervised regression baselines, achieving a better balance among anatomical accuracy, quantitative reliability, perceptual realism, and computational efficiency.
📝 Abstract
Conditional medical image generation plays an important role in many clinically relevant imaging tasks. However, existing methods still face a fundamental challenge in balancing inference efficiency, patient-specific fidelity, and distribution-level plausibility, particularly in high-dimensional 3D medical imaging. In this work, we propose GDM, a generative drifting framework that reformulates deterministic medical image prediction as a multi-objective learning problem to jointly promote distribution-level plausibility and patient-specific fidelity while retaining one-step inference. GDM extends drifting to 3D medical imaging through an attractive-repulsive drift that minimizes the discrepancy between the generator pushforward and the target distribution. To enable stable drifting-based learning in 3D volumetric data, GDM constructs a multi-level feature bank from a medical foundation encoder to support reliable affinity estimation and drifting field computation across complementary global, local, and spatial representations. In addition, a gradient coordination strategy in the shared output space improves optimization balance under competing distribution-level and fidelity-oriented objectives. We evaluate the proposed framework on two representative tasks, MRI-to-CT synthesis and sparse-view CT reconstruction. Experimental results show that GDM consistently outperforms a wide range of baselines, including GAN-based, flow-matching-based, and SDE-based generative models, as well as supervised regression methods, while improving the balance among anatomical fidelity, quantitative reliability, perceptual realism, and inference efficiency. These findings suggest that GDM provides a practical and effective framework for conditional 3D medical image generation.
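To make the attractive-repulsive drift described in the abstract more concrete, the following is a minimal sketch of one common way such a field can be written: a kernel-weighted pull toward target features minus a kernel-weighted pull toward currently generated features. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function names (`softmax_weights`, `weighted_mean`, `drift`), the squared-distance softmax kernel, and the `temperature` parameter are hypothetical, and plain Python lists stand in for encoder features.

```python
import math


def softmax_weights(x, points, temperature):
    """Softmax affinities of x to each point (assumed kernel: negative squared distance)."""
    logits = [-sum((a - b) ** 2 for a, b in zip(x, p)) / temperature for p in points]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    z = sum(exps)
    return [e / z for e in exps]


def weighted_mean(points, weights):
    """Affinity-weighted average of a list of feature vectors."""
    dim = len(points[0])
    return [sum(w * p[d] for w, p in zip(weights, points)) for d in range(dim)]


def drift(x, targets, generated, temperature=1.0):
    """Hypothetical drift at x: attraction toward targets minus attraction toward generated samples.

    Attraction is (weighted target mean - x) and repulsion is (weighted generated mean - x),
    so x cancels and the drift is the difference of the two weighted means.
    """
    attract = weighted_mean(targets, softmax_weights(x, targets, temperature))
    repel = weighted_mean(generated, softmax_weights(x, generated, temperature))
    return [a - r for a, r in zip(attract, repel)]


# Toy usage: one target feature to the right, one generated feature to the left.
v = drift([0.0, 0.0], targets=[[2.0, 0.0]], generated=[[-1.0, 0.0]])
print(v)  # the drift points toward the target and away from the generated sample
```

In this sketch the drift vanishes whenever the generated set coincides with the target set, which mirrors the intuition that the field should stop moving samples once the generator output matches the target distribution. How GDM actually estimates affinities across its global, local, and spatial feature levels is not specified in the abstract and is not modeled here.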