DVG-Diffusion: Dual-View Guided Diffusion Model for CT Reconstruction from X-Rays

📅 2025-03-22
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the challenging end-to-end reconstruction of 3D CT volumes from sparse-view 2D X-ray images, this paper proposes a dual-view collaborative guidance framework based on latent diffusion. Methodologically, we design a view-parameter-guided feature encoder to align features between X-ray and CT spaces; introduce a novel view synthesis module to generate auxiliary X-ray projections, forming a hybrid conditioning input comprising both real and synthetic views; and jointly optimize the diffusion process via feature-level alignment and CT latent representation decoding. Our key contributions are the first introduction of a dual-view collaborative guidance paradigm and a view-parameter-driven cross-modal feature alignment strategy. Extensive experiments on multiple benchmarks demonstrate significant improvements over state-of-the-art methods in both structural fidelity and perceptual quality, validating the effectiveness and robustness of our approach under sparse-view settings.
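As a toy illustration of the hybrid conditioning idea described above, the sketch below models X-rays as parallel-beam projections of a CT volume and stacks a "real" view with a second "synthesized" view into a two-channel conditioning input. All function names, shapes, and the projection model are hypothetical stand-ins, not the paper's implementation:

```python
import numpy as np

# Hypothetical sketch: model an X-ray as a parallel-beam projection of the
# CT volume, i.e. a sum along one view axis (a crude stand-in for the
# paper's view-parameter-guided synthesis).
def project(ct_volume, axis):
    """Sum the volume along one axis to mimic an X-ray projection."""
    return ct_volume.sum(axis=axis)

def hybrid_condition(ct_volume):
    """Stack a 'real' frontal view and a 'synthetic' lateral view into a
    two-channel conditioning input, mirroring the dual-view idea."""
    real_view = project(ct_volume, axis=0)       # e.g. anterior-posterior
    synthetic_view = project(ct_volume, axis=1)  # e.g. lateral, view-synthesized
    return np.stack([real_view, synthetic_view], axis=0)

ct = np.random.rand(64, 64, 64)   # toy CT volume
cond = hybrid_condition(ct)
print(cond.shape)                  # (2, 64, 64)
```

In the actual method the two views would come from a real scan and a learned view-synthesis module, and the conditioning would be feature maps rather than raw projections; the point here is only the "real + synthetic views stacked as one condition" structure.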

๐Ÿ“ Abstract
Directly reconstructing a 3D CT volume from few-view 2D X-rays with an end-to-end deep learning network is challenging, as X-ray images are merely projection views of the 3D CT volume. In this work, we facilitate the complex mapping from 2D X-ray images to 3D CT by incorporating new-view synthesis, and reduce the learning difficulty through view-guided feature alignment. Specifically, we propose a dual-view guided diffusion model (DVG-Diffusion), which couples a real input X-ray view with a synthesized new X-ray view to jointly guide CT reconstruction. First, a novel view-parameter-guided encoder captures features from X-rays that are spatially aligned with CT. Next, we concatenate the extracted dual-view features as conditions for the latent diffusion model to learn and refine the CT latent representation. Finally, the CT latent representation is decoded into a CT volume in pixel space. By incorporating view-parameter-guided encoding and dual-view guided CT reconstruction, DVG-Diffusion achieves an effective balance between high fidelity and perceptual quality in CT reconstruction. Experimental results demonstrate that our method outperforms state-of-the-art methods. Based on these experiments, we also present a comprehensive analysis and discussion of views and reconstruction.
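As a rough illustration of the conditioned latent-diffusion stage described in the abstract, the sketch below applies a standard DDPM reverse update to a toy CT latent guided by a dual-view condition. The denoiser, latent shapes, and schedule values are all hypothetical stand-ins, not the paper's model:

```python
import numpy as np

# Hypothetical sketch of one reverse-diffusion step on the CT latent,
# conditioned on dual-view features (assumed DDPM notation, not the
# authors' code).
def toy_denoiser(z_noisy, cond):
    """Stand-in for the conditional denoising network: it 'predicts' zero
    noise. A real model is trained to predict the added noise eps."""
    return np.zeros_like(z_noisy)

def ddpm_step(z_t, cond, alpha_t, alpha_bar_t):
    """One DDPM posterior-mean update on the CT latent z_t, guided by cond."""
    eps_hat = toy_denoiser(z_t, cond)
    return (z_t - (1 - alpha_t) / np.sqrt(1 - alpha_bar_t) * eps_hat) / np.sqrt(alpha_t)

z_t = np.ones(8)     # toy 1-D "CT latent"
cond = np.zeros(8)   # toy dual-view condition
z_prev = ddpm_step(z_t, cond, alpha_t=0.25, alpha_bar_t=0.5)
print(z_prev)        # with eps_hat = 0: z_t / sqrt(alpha_t) = 2.0 everywhere
```

In the full pipeline this step would run over many timesteps in the latent space, after which a decoder maps the refined CT latent back to a pixel-space CT volume.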
Problem

Research questions and friction points this paper is trying to address.

Direct 3D CT reconstruction from few 2D X-rays
Enhancing 2D-to-3D mapping via dual-view synthesis
Balancing fidelity and quality in CT reconstruction
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-view guided diffusion model for CT reconstruction
View parameter-guided encoder aligns X-ray features
Latent diffusion model refines CT representation
Xing Xie
State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China; University of Chinese Academy of Sciences, Beijing, 100049, China
Jiawei Liu
State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China; University of Chinese Academy of Sciences, Beijing, 100049, China
Huijie Fan
Shenyang Institute of Automation, Chinese Academy of Sciences
Zhi Han
Shenyang Institute of Automation, Chinese Academy of Sciences
Computer Vision
Yandong Tang
Professor, Shenyang Institute of Automation, Chinese Academy of Sciences
Computer Vision, Image Processing, Pattern Recognition
Liangqiong Qu
The University of Hong Kong
Medical Image Analysis, Image Synthesis, Illumination Modeling, Federated Learning