🤖 AI Summary
Existing whole-body MR-to-CT synthesis methods suffer from substantial spatial misalignment and suboptimal image quality, which limits their clinical utility in PET/MR attenuation correction and MR-only radiotherapy planning. To address these challenges, we propose a 3D Wavelet Latent Diffusion Model (3D-WLDM) that learns a multi-scale MR-to-CT mapping within a compressed latent space. A Wavelet Residual Module in the encoder-decoder improves the capture and reconstruction of fine-scale features. Structural anatomy is disentangled from modality-specific features and anchored during diffusion to ensure anatomical consistency, while a Dual Skip Connection Attention mechanism jointly enhances bony detail and soft-tissue contrast. Quantitative and qualitative evaluations across multiple public benchmarks demonstrate significant improvements in spatial accuracy and image fidelity over prior art, achieving state-of-the-art performance. The proposed method delivers a clinically reliable solution for MR-only workflows.
📝 Abstract
Magnetic Resonance (MR) imaging plays an essential role in contemporary clinical diagnostics. It is increasingly integrated into advanced therapeutic workflows, such as hybrid Positron Emission Tomography/Magnetic Resonance (PET/MR) imaging and MR-only radiation therapy. These workflows depend on accurate estimation of radiation attenuation, which is typically obtained by synthesizing Computed Tomography (CT) images from MR scans to generate attenuation maps. However, existing MR-to-CT synthesis methods for whole-body imaging often suffer from poor spatial alignment between the generated CT and the input MR images, and from image quality that is insufficient for reliable use in downstream clinical tasks. In this paper, we present a novel 3D Wavelet Latent Diffusion Model (3D-WLDM) that addresses these limitations by performing modality translation in a learned latent space. By incorporating a Wavelet Residual Module into the encoder-decoder architecture, we enhance the capture and reconstruction of fine-scale features across image and latent spaces. To preserve anatomical integrity during the diffusion process, we disentangle structural and modality-specific characteristics and anchor the structural component to prevent warping. We also introduce a Dual Skip Connection Attention mechanism within the diffusion model, enabling the generation of high-resolution CT images with improved representation of bony structures and soft-tissue contrast.
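To make the wavelet component more concrete, the following is a minimal sketch of a single-level 3D wavelet decomposition, the kind of multi-scale split that a wavelet-based module could operate on. It is not the authors' Wavelet Residual Module: the abstract does not specify the wavelet family or how the sub-bands are used, so the Haar wavelet and the function names `haar_dwt3d` and `haar_idwt3d` are assumptions made purely for illustration. The volume is split into one low-frequency sub-band (`LLL`, coarse anatomy) and seven high-frequency sub-bands (fine detail such as edges), and the inverse transform reconstructs the input exactly.

```python
import numpy as np


def haar_split(x, axis):
    """Split x along one axis into orthonormal Haar low-pass and high-pass halves."""
    even = np.take(x, np.arange(0, x.shape[axis], 2), axis=axis)
    odd = np.take(x, np.arange(1, x.shape[axis], 2), axis=axis)
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)


def haar_merge(lo, hi, axis):
    """Invert haar_split: interleave the reconstructed even and odd samples."""
    shape = list(lo.shape)
    shape[axis] *= 2
    out = np.empty(shape, dtype=np.result_type(lo, hi))
    even = [slice(None)] * lo.ndim
    odd = [slice(None)] * lo.ndim
    even[axis] = slice(0, None, 2)
    odd[axis] = slice(1, None, 2)
    out[tuple(even)] = (lo + hi) / np.sqrt(2.0)
    out[tuple(odd)] = (lo - hi) / np.sqrt(2.0)
    return out


def haar_dwt3d(volume):
    """Single-level 3D Haar transform (illustrative, not the paper's module).

    Returns a dict of 8 sub-bands keyed 'LLL' ... 'HHH'; the i-th letter says
    whether axis i was low-pass (L) or high-pass (H) filtered.
    Every dimension of `volume` must be even.
    """
    bands = {"": volume}
    for axis in range(3):
        split = {}
        for key, arr in bands.items():
            lo, hi = haar_split(arr, axis)
            split[key + "L"] = lo
            split[key + "H"] = hi
        bands = split
    return bands


def haar_idwt3d(bands):
    """Reconstruct the volume from the 8 sub-bands produced by haar_dwt3d."""
    for axis in (2, 1, 0):
        prefixes = {key[:-1] for key in bands}
        bands = {p: haar_merge(bands[p + "L"], bands[p + "H"], axis) for p in prefixes}
    return bands[""]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    volume = rng.standard_normal((8, 8, 8))  # stand-in for an MR or CT patch
    sub_bands = haar_dwt3d(volume)
    print(sorted(sub_bands))                 # the eight sub-band names
    print(sub_bands["LLL"].shape)            # each sub-band is half-size per axis
    print(np.allclose(haar_idwt3d(sub_bands), volume))
```

Because the transform is orthonormal and exactly invertible, a network that processes the sub-bands can in principle treat coarse structure and fine detail separately without discarding information. How 3D-WLDM actually combines such sub-bands with residual connections is not described in the abstract.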