AI Summary
This study addresses the limitations of conventional cone-beam computed tomography (CBCT), notably high radiation exposure and cost, and the inability of single panoramic X-rays to yield geometrically consistent, high-fidelity 3D dental reconstructions. To overcome these challenges, the authors propose HiCT, a two-stage framework that first leverages a video diffusion model to synthesize geometrically consistent multi-view projections from a single panoramic X-ray, followed by high-fidelity CBCT reconstruction via a ray-based dynamic attention network integrated with an X-ray sampling strategy. This work pioneers the integration of video diffusion models with ray-wise dynamic attention mechanisms and introduces XCT, a large-scale paired dataset enabling robust training and validation. Experimental results demonstrate state-of-the-art performance on clinically relevant metrics, achieving accurate, geometrically consistent CBCT reconstructions with strong potential for clinical translation.
Abstract
Accurate 3D dental imaging is vital for diagnosis and treatment planning, yet CBCT's high radiation dose and cost limit its accessibility. Reconstructing 3D volumes from a single low-dose panoramic X-ray is a promising alternative but remains challenging due to geometric inconsistencies and limited accuracy. We propose HiCT, a two-stage framework that first generates geometrically consistent multi-view projections from a single panoramic image using a video diffusion model, and then reconstructs high-fidelity CBCT from the projections using a ray-based dynamic attention network and an X-ray sampling strategy. To support this, we built XCT, a large-scale dataset combining public CBCT data with 500 paired PX-CBCT cases. Extensive experiments show that HiCT achieves state-of-the-art performance, delivering accurate and geometrically consistent reconstructions for clinical use.