AI Summary
To address the dual challenges of scarce annotated data and physical modeling inaccuracies in CT image synthesis, this paper proposes PRO, a projection-domain diffusion framework and, to the authors' knowledge, the first diffusion-based approach to tightly integrate CT imaging physics priors with anatomical text prompts for end-to-end, controllable cross-modal generation. Unlike conventional image-domain approaches, PRO operates directly in the raw projection domain, avoiding the error accumulation introduced by intermediate reconstructions. It introduces a text-projection joint conditioning mechanism that enables structure-preserving, anatomy-consistent synthesis with explicit control. Experiments show that, under low-dose and sparse-view reconstruction settings, PRO achieves an average 2.1 dB PSNR improvement on downstream tasks using only few-shot training data. The model also exhibits strong multi-task generalization, positioning it as a foundational generative model for CT. The implementation is publicly released to facilitate practical medical image data augmentation.
Abstract
Synthesizing high-quality CT images remains a significant challenge due to the limited availability of annotated data and the complex nature of CT imaging. In this work, we present PRO, a novel framework that, to the best of our knowledge, is the first to perform CT image synthesis in the projection domain using latent diffusion models. Unlike previous approaches that operate in the image domain, PRO learns rich structural representations from raw projection data and leverages anatomical text prompts for controllable synthesis. This projection-domain strategy enables more faithful modeling of the underlying imaging physics and anatomical structures. Moreover, PRO functions as a foundation model, capable of generalizing across diverse downstream tasks by adjusting its generative behavior via prompt inputs. Experimental results demonstrate that incorporating our synthesized data significantly improves performance across multiple downstream tasks, including low-dose and sparse-view reconstruction, even with limited training data. These findings underscore the versatility and scalability of PRO for data generation in various CT applications, and highlight the potential of projection-domain synthesis as a powerful tool for data augmentation and robust CT imaging. Our source code is publicly available at: https://github.com/yqx7150/PRO.
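To make the projection-domain idea concrete, the following is a minimal, illustrative sketch (not the paper's implementation) of how a diffusion forward process could be applied to a sinogram and jointly conditioned on a text-prompt embedding. All names (`sinogram`, `text_emb`, `forward_diffuse`) and the channel-concatenation conditioning design are hypothetical assumptions for illustration; PRO's actual architecture operates on latent representations and may condition differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "projection" data: a sinogram of detector bins x view angles.
sinogram = rng.standard_normal((64, 90)).astype(np.float32)

# Toy anatomical text-prompt embedding (e.g. from a frozen text encoder).
text_emb = rng.standard_normal((16,)).astype(np.float32)

def forward_diffuse(x0, t, T=1000, beta_min=1e-4, beta_max=0.02):
    """Standard DDPM forward process:
    x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    betas = np.linspace(beta_min, beta_max, T)
    alpha_bar = np.prod(1.0 - betas[: t + 1])
    eps = rng.standard_normal(x0.shape).astype(np.float32)
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return xt, eps

# Noise the projection at an intermediate timestep.
xt, eps = forward_diffuse(sinogram, t=500)

# Joint text-projection conditioning: one common design choice is to
# broadcast the text embedding over the spatial grid and stack it with
# the noisy projection as extra input channels for the denoiser.
cond = np.broadcast_to(text_emb[:, None, None], (16, *sinogram.shape))
denoiser_input = np.concatenate([xt[None], cond], axis=0)
print(denoiser_input.shape)  # (17, 64, 90)
```

A denoising network trained on such inputs would predict `eps`, so that synthesis runs entirely in the projection domain; a conventional reconstruction step (e.g. filtered back-projection) would only be applied afterward, which is the point of avoiding error accumulation from intermediate reconstructions.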