🤖 AI Summary
MRI motion artifacts severely degrade image quality and the accuracy of quantitative analysis, while existing correction methods suffer from high computational cost and complex pipelines. To address this, we propose Res-MoCoDiff, a novel end-to-end diffusion model for motion correction. It introduces a residual error shifting mechanism in the forward diffusion process to align the noise distribution with motion-corrupted data; designs an efficient four-step reverse diffusion process to accelerate sampling; and integrates Swin-Transformer blocks into a U-Net backbone to enhance multi-scale feature representation. Trained on realistically motion-simulated data with an ℓ₁ + ℓ₂ hybrid loss and validated on in-vivo data, Res-MoCoDiff achieves state-of-the-art performance: SSIM = 0.932 ± 0.014, NMSE = 0.021 ± 0.005, and PSNR = 41.91 ± 2.94 dB. It corrects a batch of two slices in just 0.37 seconds, roughly 275× faster than conventional approaches.
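The core idea of shifting the forward diffusion toward the motion-corrupted image can be sketched numerically. This is a minimal illustration, not the paper's implementation: the schedule value `eta_t` and noise scale `kappa` are illustrative assumptions, following the general residual-shifting formulation in which the mean drifts from the clean image toward the corrupted one as `eta_t` goes from 0 to 1.

```python
import numpy as np

def residual_shift_forward(x0, y, eta_t, kappa=1.0, rng=None):
    """Hypothetical residual-shifted forward diffusion step.

    x0    : clean (motion-free) image, numpy array
    y     : motion-corrupted image, same shape as x0
    eta_t : shift-schedule value in [0, 1] at step t
    kappa : noise scale (illustrative value, not from the paper)
    """
    rng = np.random.default_rng() if rng is None else rng
    e_res = y - x0                          # residual between corrupted and clean image
    noise = rng.standard_normal(x0.shape)
    # Mean moves from x0 (eta_t = 0) toward y (eta_t = 1);
    # the noise variance grows with eta_t.
    return x0 + eta_t * e_res + kappa * np.sqrt(eta_t) * noise
```

Because the terminal state of this forward process already matches the motion-corrupted data rather than pure Gaussian noise, the reverse process has far less distribution gap to bridge, which is what makes a short (four-step) sampler plausible.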
📝 Abstract
Purpose: Motion artifacts in magnetic resonance imaging (MRI) significantly degrade image quality and impair quantitative analysis. Conventional mitigation strategies, such as repeated acquisitions or motion tracking, are costly and workflow-intensive. This study introduces Res-MoCoDiff, an efficient denoising diffusion probabilistic model tailored for MRI motion artifact correction.

Methods: Res-MoCoDiff incorporates a novel residual error shifting mechanism in the forward diffusion process, aligning the noise distribution with motion-corrupted data and enabling an efficient four-step reverse diffusion. A U-Net backbone in which Swin-Transformer blocks replace conventional attention layers improves adaptability across resolutions. Training employs a combined ℓ₁ + ℓ₂ loss, which promotes image sharpness and reduces pixel-level errors. Res-MoCoDiff was evaluated on a synthetic dataset generated with a realistic motion simulation framework and on an in-vivo dataset. Comparative analyses against established methods, including CycleGAN, Pix2pix, and MT-DDPM, used quantitative metrics such as peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), and normalized mean squared error (NMSE).

Results: The proposed method demonstrated superior performance in removing motion artifacts across all motion severity levels. Res-MoCoDiff consistently achieved the highest SSIM and the lowest NMSE values, with a PSNR of up to 41.91 ± 2.94 dB for minor distortions. Notably, the average sampling time was reduced to 0.37 seconds per batch of two image slices, compared with 101.74 seconds for conventional approaches.
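For reference, the two simpler evaluation metrics quoted above have standard closed forms. A minimal sketch (assuming images normalized to a known data range; SSIM is omitted since it involves windowed local statistics and is typically taken from a library such as scikit-image):

```python
import numpy as np

def nmse(ref, img):
    """Normalized mean squared error of img relative to the reference."""
    return np.sum((ref - img) ** 2) / np.sum(ref ** 2)

def psnr(ref, img, data_range=1.0):
    """Peak signal-to-noise ratio in dB, for intensities in [0, data_range]."""
    mse = np.mean((ref - img) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)
```

Under these definitions, lower NMSE and higher PSNR both indicate a corrected image closer to the motion-free reference, consistent with how the results above are reported.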