🤖 AI Summary
This work addresses three obstacles to scaling diffusion-based sparse-view CT reconstruction to large 3D volumes: the high memory and computational cost of 3D diffusion models, the scarcity of large 3D training datasets, and inter-slice inconsistencies when 2D models are applied slice by slice. Sparse-view CT itself is motivated by the need to reduce radiation dose and scan time. The authors propose Conditional Diffusion Posterior Alignment (CDPA), a framework that combines conditional diffusion with explicit data consistency. A 2D U-Net diffusion model is conditioned on an initial 3D reconstruction to improve inter-slice coherence, and a data-consistency alignment step enforces fidelity to the measured projections. Using a 2D model avoids the memory and compute demands of a full 3D diffusion model. Experiments on both synthetic and real cone-beam CT (CBCT) datasets report state-of-the-art performance, with ablations confirming that the components work synergistically. The same principles also strengthen fast denoising U-Nets, which reach near-diffusion quality at a fraction of the computational cost.
📝 Abstract
Computed Tomography (CT) is a widely used imaging modality in medical and industrial applications. To limit radiation exposure and measurement time, there is growing interest in sparse-view CT, where the number of projection views is significantly reduced. Deep neural networks, especially generative diffusion models, have shown great promise in improving reconstruction quality in sparse-view CT. However, these methods struggle to scale to large 3D volumes for several reasons: (i) the high memory and computational requirements of 3D models, (ii) the lack of large 3D training datasets, and (iii) the inconsistencies across slices when 2D models are applied independently to each slice. We overcome these limitations and scale diffusion-based sparse-view CT reconstruction to large 3D volumes by combining conditional diffusion with explicit data consistency. We propose Conditional Diffusion Posterior Alignment (CDPA) to enable scalable 3D sparse-view CT reconstruction. A 2D U-Net diffusion model is conditioned on an initial 3D reconstruction to improve inter-slice consistency, and it is combined with data-consistency alignment to match the measured projections. Experiments on synthetic and real Cone Beam CT (CBCT) data show state-of-the-art performance, with ablations that confirm the synergistic effects of the proposed pipeline. Finally, we show that the same principles also strengthen fast denoising U-Nets, yielding near-diffusion quality at a fraction of the computational cost.
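To make the idea of data-consistency alignment concrete, the following is a minimal toy sketch and not the authors' implementation. It assumes a linear measurement model y = A x, uses a random matrix `A` as a stand-in for a real CT projection operator, and uses a noisy copy of the ground truth as a stand-in for a network output. The function name `data_consistency_step` and all parameters are hypothetical. The alignment shown here is plain gradient descent on the squared projection residual, which is one common way to enforce data consistency; the paper may use a different scheme.

```python
import numpy as np


def data_consistency_step(x, A, y, n_iters=50, step=None):
    """Reduce the residual ||A x - y|| by gradient descent on 0.5 * ||A x - y||^2.

    Hypothetical helper for illustration only.
    """
    if step is None:
        # Step size 1 / ||A||_2^2 keeps the residual non-increasing.
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(n_iters):
        x = x - step * (A.T @ (A @ x - y))
    return x


rng = np.random.default_rng(0)
n_pixels, n_meas = 64, 24  # fewer measurements than unknowns, as in sparse-view CT

A = rng.normal(size=(n_meas, n_pixels))  # assumed stand-in for the projection operator
x_true = rng.normal(size=n_pixels)       # toy ground-truth image (flattened)
y = A @ x_true                           # noise-free simulated measurements

# Assumed stand-in for the output of a learned model: ground truth plus error.
x_prior = x_true + 0.3 * rng.normal(size=n_pixels)

x_aligned = data_consistency_step(x_prior, A, y)

res_before = np.linalg.norm(A @ x_prior - y)
res_after = np.linalg.norm(A @ x_aligned - y)
err_before = np.linalg.norm(x_prior - x_true)
err_after = np.linalg.norm(x_aligned - x_true)

print(f"projection residual: {res_before:.3f} -> {res_after:.3f}")
print(f"image error:         {err_before:.3f} -> {err_after:.3f}")
```

In this toy setting the update only changes the part of the estimate that the measurements can see, so the remaining error lies in directions the sparse measurements cannot constrain. That is where a learned prior, such as the conditioned diffusion model described above, would have to supply the missing information.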