Prompt-SID: Learning Structural Representation Prompt via Latent Diffusion for Single-Image Denoising

📅 2025-02-10

🤖 AI Summary

Existing self-supervised single-image denoising methods rely on blind-spot networks or sub-image sampling, which often cause structural distortion and loss of detail, while supervised approaches are limited by the high cost of acquiring paired clean-noisy data. This paper proposes Prompt-SID, a self-supervised denoising framework that needs no clean-noisy pairs. Its key contributions are: (1) the first structured prompt learning paradigm based on downsampled image pairs; (2) a latent diffusion model that generates robust structural representations; (3) a structural attention module that enhances geometric awareness in the Transformer decoder; and (4) a scale-replay training strategy that mitigates multi-scale modeling bias. Extensive experiments on synthetic noise, real-world noise, and fluorescence microscopy images show that Prompt-SID significantly outperforms state-of-the-art self-supervised and unsupervised methods, particularly in texture fidelity and in edge and structure recovery, without requiring any clean reference data.

📝 Abstract

Many studies have focused on building supervised image denoising models from paired datasets, which are expensive and time-consuming to collect. Current self-supervised and unsupervised approaches typically rely on blind-spot networks or sub-image pair sampling, which lose pixel information and destroy fine structural detail, significantly limiting their effectiveness. In this paper, we introduce Prompt-SID, a prompt-learning-based single-image denoising framework that emphasizes preserving structural details. The framework is trained in a self-supervised manner on downsampled image pairs. It captures original-scale image information through structural encoding and integrates this information into the denoiser as a prompt. To achieve this, we propose a structural representation generation model based on the latent diffusion process and design a structural attention module within the transformer-based denoiser to decode the prompt. Additionally, we introduce a scale replay training mechanism, which effectively mitigates the scale gap between images of different resolutions. Comprehensive experiments on synthetic, real-world, and fluorescence imaging datasets demonstrate the effectiveness of Prompt-SID.
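
To make the phrase "downsampled image pairs" concrete, the following is a minimal sketch of one common way such pairs can be drawn from a single noisy image: each 2x2 cell contributes one randomly chosen pixel to each of two half-resolution sub-images. This sampler, the function name `sample_downsampled_pair`, and its parameters are assumptions made for illustration; the abstract does not specify the exact sampling scheme Prompt-SID uses.

```python
import numpy as np


def sample_downsampled_pair(noisy, rng):
    """Draw two half-resolution sub-images from one noisy image.

    Hypothetical helper (not taken from the paper): for every 2x2 cell,
    two *different* pixels are picked at random, one per sub-image.
    Accepts (H, W) or (H, W, C) arrays; odd borders are cropped.
    """
    h2, w2 = noisy.shape[0] // 2, noisy.shape[1] // 2
    extra = noisy.shape[2:]  # () for grayscale, (C,) for colour

    # View the cropped image as a grid of 2x2 cells: (h2, 2, w2, 2, ...)
    cells = noisy[: h2 * 2, : w2 * 2].reshape(h2, 2, w2, 2, *extra)
    # Bring the two in-cell axes together and flatten them to 4 candidates.
    cells = np.moveaxis(cells, 1, 2).reshape(h2, w2, 4, *extra)

    # First pick is uniform over the 4 pixels; the second adds an offset
    # of 1..3 (mod 4), so it can never coincide with the first.
    idx_a = rng.integers(0, 4, size=(h2, w2))
    idx_b = (idx_a + rng.integers(1, 4, size=(h2, w2))) % 4

    rows, cols = np.indices((h2, w2))
    return cells[rows, cols, idx_a], cells[rows, cols, idx_b]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.normal(size=(64, 64, 3))  # stand-in for a noisy image
    sub_a, sub_b = sample_downsampled_pair(image, rng)
    print(sub_a.shape, sub_b.shape)
```

In a scheme of this kind, one sub-image typically serves as the network input and the other as the training target. Both are at half resolution, which is the information loss the abstract attributes to sub-image sampling and the reason Prompt-SID adds a structural prompt encoded from the original-scale image.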

Problem

Research questions and friction points this paper is trying to address.

Single-image denoising

Structural detail preservation

Self-supervised learning framework

Innovation

Methods, ideas, or system contributions that make the work stand out.

Prompt-learning-based denoising framework

Latent diffusion structural representation

Transformer-based structural attention module

🔎 Similar Papers

PromptRR: Diffusion Models as Prompt Generators for Single Image Reflection Removal