🤖 AI Summary
Low-dose CT imaging trades image quality against radiation safety, and CT super-resolution must also cope with unknown degradation processes and limited training data. To address these challenges, this work pioneers the adaptation of Stable Diffusion to blind CT super-resolution. We propose a learnable CT degradation model, integrated with CLIP-driven vision-language description generation and a dual-condition (low-resolution image + text) controllable diffusion sampling strategy. This multimodal conditional denoising diffusion framework achieves state-of-the-art performance across multiple CT datasets, improving PSNR by 2.1 dB and SSIM by 0.032, while enabling sub-mGy low-dose reconstruction. Our approach establishes a novel paradigm for clinically safe, high-fidelity CT imaging.
📝 Abstract
High-resolution computed tomography (CT) imaging is essential for medical diagnosis but requires increased radiation exposure, creating a critical trade-off between image quality and patient safety. While deep learning methods have shown promise in CT super-resolution, they face challenges with complex degradations and limited medical training data. Meanwhile, large-scale pre-trained diffusion models, particularly Stable Diffusion, have demonstrated remarkable capabilities in synthesizing fine details across various vision tasks. Motivated by this, we propose a novel framework that adapts Stable Diffusion for CT blind super-resolution. We employ a practical degradation model to synthesize realistic low-quality images and leverage a pre-trained vision-language model to generate corresponding descriptions. Subsequently, we perform super-resolution using Stable Diffusion with a specialized controlling strategy, conditioned on both low-resolution inputs and the generated text descriptions. Extensive experiments show that our method outperforms existing approaches, demonstrating its potential for achieving high-quality CT imaging at reduced radiation doses. Our code will be made publicly available.
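The abstract does not spell out which operations the degradation model applies, so the following is only a minimal sketch of how low-quality training inputs are commonly synthesized for blind super-resolution, assuming a classical blur, downsample, and noise chain. The function name `degrade` and the parameters `scale`, `sigma`, `noise_std`, and `seed` are hypothetical and are not taken from the paper; the image is assumed to be a 2-D list of intensities normalized to [0, 1].

```python
import math
import random


def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Filter every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [
            sum(k * row[min(max(x + j - radius, 0), width - 1)]
                for j, k in enumerate(kernel))
            for x in range(width)
        ]
        for row in image
    ]


def transpose(image):
    """Swap rows and columns of a 2-D list."""
    return [list(column) for column in zip(*image)]


def degrade(image, scale=2, sigma=1.0, noise_std=0.01, seed=0):
    """Assumed blur -> downsample -> noise chain (not the paper's model)."""
    rng = random.Random(seed)
    kernel = gaussian_kernel(sigma, radius=max(1, int(3 * sigma)))
    # Separable Gaussian blur: filter the rows, then the columns.
    blurred = transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))
    # Downsample by keeping every `scale`-th pixel in both directions.
    low_res = [row[::scale] for row in blurred[::scale]]
    # Add Gaussian noise and clip back to the [0, 1] intensity range.
    return [
        [min(1.0, max(0.0, value + rng.gauss(0.0, noise_std))) for value in row]
        for row in low_res
    ]
```

In a blind setting, parameters such as `sigma` and `noise_std` would typically be drawn at random for each training sample so that the restoration model never sees a single fixed degradation; pairing each degraded slice with its original yields the low-resolution/high-resolution training pairs.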