🤖 AI Summary
This work addresses the loss of image quality in low-dose PET, which arises when radiation exposure is reduced by lowering the radiotracer dose or shortening the scan. It also targets two obstacles to MR-guided restoration: structural and textural inconsistency in multimodal fusion, and mismatch on out-of-distribution (OOD) data. The authors propose MFdiff, a supervise-assisted multi-modality fusion diffusion model. A dedicated multimodal feature fusion module learns an optimized fusion feature, so that the auxiliary MR image can guide restoration without introducing extraneous anatomical detail. This fusion feature then serves as an additional condition for a diffusion model that iteratively generates the standard-dose PET image. Training follows a two-stage supervise-assisted strategy: the model first pretrains on simulated in-distribution data to acquire generalized priors, then fine-tunes on in-vivo OOD data to capture task-specific priors. Experiments show that MFdiff outperforms existing state-of-the-art methods, both qualitatively and quantitatively, in restoring standard-dose PET images from low-dose inputs.
📝 Abstract
Positron emission tomography (PET) offers powerful functional imaging but involves radiation exposure. Efforts to reduce this exposure by lowering the radiotracer dose or shortening the scan time can degrade image quality. Using magnetic resonance (MR) images, which carry clearer anatomical information, to restore standard-dose PET (SPET) from low-dose PET (LPET) is a promising approach, but it is hampered by structural and textural inconsistencies that arise in multi-modality fusion and by mismatch on out-of-distribution (OOD) data. In this paper, we propose a supervise-assisted multi-modality fusion diffusion model (MFdiff) to address these challenges and achieve high-quality PET restoration. First, to fully utilize auxiliary MR images without introducing extraneous details into the restored image, a multi-modality feature fusion module is designed to learn an optimized fusion feature. Second, using the fusion feature as an additional condition, high-quality SPET images are iteratively generated by the diffusion model. Furthermore, we introduce a two-stage supervise-assisted learning strategy that harnesses both generalized priors from simulated in-distribution datasets and specific priors tailored to in-vivo OOD data. Experiments demonstrate that the proposed MFdiff effectively restores high-quality SPET images from multi-modality inputs and outperforms state-of-the-art methods both qualitatively and quantitatively.
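
To make the idea of conditioning a diffusion model on a fusion feature concrete, here is a minimal NumPy sketch of DDPM-style ancestral sampling with a conditioning input. It is not the authors' implementation. The linear noise schedule, the fixed-weight `fuse_features` blend, and the `toy_noise_predictor` (which simply pretends that the clean image equals the condition) are all hypothetical stand-ins for the learned fusion module and the trained network, neither of which the abstract specifies.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linear DDPM noise schedule (an assumption; the abstract names no schedule)."""
    return np.linspace(beta_start, beta_end, num_steps)


def fuse_features(lpet, mr, weight=0.5):
    """Hypothetical stand-in for the learned fusion module: a fixed blend."""
    return weight * lpet + (1.0 - weight) * mr


def toy_noise_predictor(x_t, alpha_bar_t, cond):
    """Placeholder for a trained network eps_theta(x_t, t, cond).

    It pretends the clean image equals `cond` and solves
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps for eps.
    """
    return (x_t - np.sqrt(alpha_bar_t) * cond) / np.sqrt(1.0 - alpha_bar_t)


def sample_conditioned(cond, num_steps=50, seed=0):
    """Run the reverse diffusion chain from pure noise, guided by `cond`."""
    rng = np.random.default_rng(seed)
    betas = linear_beta_schedule(num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    # Start from Gaussian noise with the same shape as the condition.
    x = rng.standard_normal(cond.shape)
    for t in range(num_steps - 1, -1, -1):
        eps = toy_noise_predictor(x, alpha_bars[t], cond)
        # Standard DDPM posterior mean computed from the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        if t > 0:
            # Inject fresh noise at every step except the last.
            x = mean + np.sqrt(betas[t]) * rng.standard_normal(cond.shape)
        else:
            x = mean
    return x


# Toy 8x8 "images" standing in for an LPET slice and its paired MR slice.
rng = np.random.default_rng(1)
lpet = rng.random((8, 8))
mr = rng.random((8, 8))
cond = fuse_features(lpet, mr)
restored = sample_conditioned(cond)
```

Because the placeholder predictor treats the condition as the clean image, the chain here collapses onto `cond`. In a real system the predictor would be a trained network, and the two-stage strategy described above would govern how its weights are learned.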