🤖 AI Summary
To address the low computational efficiency and insufficient reliability of deep learning methods for medical image restoration, this paper proposes LRformer, a lightweight frequency-domain Transformer. It introduces a reliability-guided frequency-domain learning paradigm; designs a Reliable Lesion-Semantic Prior Producer (RLPP) based on Monte Carlo sampling; and proposes a Guided Frequency Cross-Attention (GFCA) mechanism that exploits the conjugate symmetry of the Fast Fourier Transform (FFT) to cut the computational complexity of naive cross-attention by nearly half. Evaluated on low-dose CT denoising, MRI super-resolution, and MRI artifact removal, LRformer consistently outperforms state-of-the-art methods, achieving higher PSNR and SSIM with 38% fewer parameters and 47% lower FLOPs, while using Bayesian-inspired uncertainty quantification to make the restored results more reliable.
📝 Abstract
Medical image restoration aims to recover high-quality images from degraded observations and is in urgent demand in many clinical scenarios, such as low-dose CT image denoising, MRI super-resolution, and MRI artifact removal. Although existing deep learning-based restoration methods have succeeded by using sophisticated modules, they struggle to produce reconstructions in a computationally efficient way. Moreover, they usually ignore the reliability of the restoration results, which is especially critical in medical systems. To alleviate these issues, we present LRformer, a Lightweight Transformer-based method via Reliability-guided learning in the frequency domain. Specifically, inspired by uncertainty quantification in Bayesian neural networks (BNNs), we develop a Reliable Lesion-Semantic Prior Producer (RLPP). RLPP leverages Monte Carlo (MC) estimators with stochastic sampling operations to generate sufficiently reliable priors by performing multiple inferences with the foundational medical image segmentation model MedSAM. Additionally, instead of directly incorporating the priors in the spatial domain, we decompose the cross-attention (CA) mechanism into real symmetric and imaginary anti-symmetric parts via the fast Fourier transform (FFT), resulting in the design of the Guided Frequency Cross-Attention (GFCA) solver. By leveraging the conjugate symmetry property of the FFT, GFCA reduces the computational complexity of naive CA by nearly half. Extensive experimental results on various tasks demonstrate the superiority of the proposed LRformer in both effectiveness and efficiency.
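The conjugate symmetry that the abstract refers to is a general property of the FFT of real-valued signals, and a short NumPy sketch can make it concrete. This is not the GFCA implementation; it only checks, on an assumed random 1-D signal, that the real part of the spectrum is symmetric, the imaginary part is anti-symmetric, and roughly half of the coefficients are enough to recover the input.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)        # real-valued signal of length N = 16
n = x.size

full = np.fft.fft(x)               # all N complex coefficients
half = np.fft.rfft(x)              # only the N // 2 + 1 non-redundant ones

# Index map k -> (N - k) mod N, used to mirror the spectrum.
flipped = (-np.arange(n)) % n

# Conjugate symmetry for real input: X[k] == conj(X[N - k]).
symmetric = np.allclose(full, np.conj(full[flipped]))

# Equivalently: the real part is even, the imaginary part is odd.
real_even = np.allclose(full.real, full.real[flipped])
imag_odd = np.allclose(full.imag, -full.imag[flipped])

# The half spectrum is enough to reconstruct the signal.
recovered = np.fft.irfft(half, n=n)
lossless = np.allclose(recovered, x)
```

For 2-D feature maps the analogous NumPy calls are `np.fft.rfft2` and `np.fft.irfft2`, which likewise store only about half of the coefficients along the last axis.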