Prompt-SID: Learning Structural Representation Prompt via Latent Diffusion for Single-Image Denoising

📅 2025-02-10
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing self-supervised single-image denoising methods rely on blind-spot networks or sub-image sampling, which often causes structural distortion and detail loss; supervised approaches are hindered by the high cost of acquiring paired clean-noisy data. This paper proposes Prompt-SID, a self-supervised denoising framework that needs no paired data. Its key contributions are: (1) a structural prompt learning paradigm trained on downsampled image pairs; (2) a latent diffusion model that generates robust structural representations; (3) a structural attention module that decodes the prompt inside the Transformer-based denoiser; and (4) a scale replay training strategy that mitigates the scale gap between images of different resolutions. Experiments on synthetically noisy, real-world noisy, and fluorescence microscopy images indicate that Prompt-SID outperforms state-of-the-art self-supervised and unsupervised methods, particularly in texture fidelity and edge and structure recovery, without requiring any clean reference data.

📝 Abstract
Many studies have concentrated on constructing supervised image-denoising models from paired datasets, which are expensive and time-consuming to acquire. Current self-supervised and unsupervised approaches typically rely on blind-spot networks or sub-image pair sampling, which loses pixel information and destroys detailed structural information, thereby significantly constraining the efficacy of such methods. In this paper, we introduce Prompt-SID, a prompt-learning-based single-image denoising framework that emphasizes preserving structural details. The approach is trained in a self-supervised manner using downsampled image pairs. It captures original-scale image information through structural encoding and integrates this prompt into the denoiser. To achieve this, we propose a structural representation generation model based on the latent diffusion process and design a structural attention module within the transformer-based denoiser architecture to decode the prompt. Additionally, we introduce a scale replay training mechanism, which effectively mitigates the scale gap between images of different resolutions. We conduct comprehensive experiments on synthetic, real-world, and fluorescence imaging datasets, showcasing the remarkable effectiveness of Prompt-SID.
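
The abstract mentions training on downsampled image pairs but does not spell out the sampler. As a rough illustration only, the sketch below shows one common way such pairs are built in self-supervised denoising: split the noisy image into non-overlapping 2x2 cells and draw two different pixels from each cell, giving two half-resolution sub-images of the same scene with different noise samples. The function name `sample_downsampled_pair` and the choice of "any two distinct pixels per cell" are assumptions made for this example, not the paper's confirmed procedure.

```python
import numpy as np


def sample_downsampled_pair(noisy, rng):
    """Split an (H, W) or (H, W, C) image into two half-resolution sub-images.

    Hypothetical helper for illustration: each non-overlapping 2x2 cell
    contributes one pixel to each sub-image, and the two pixels are
    always different positions within the cell.
    """
    h, w = noisy.shape[:2]
    h2, w2 = h // 2, w // 2
    extra = noisy.shape[2:]  # () for grayscale, (C,) for colour

    # Crop to an even size and regroup into cells of 4 pixels: (h2, w2, 4, ...)
    cells = noisy[: h2 * 2, : w2 * 2].reshape(h2, 2, w2, 2, *extra)
    cells = np.moveaxis(cells, 1, 2).reshape(h2, w2, 4, *extra)

    # First index is uniform in {0..3}; the second is offset by 1..3 (mod 4),
    # so it can never coincide with the first.
    idx_a = rng.integers(0, 4, size=(h2, w2))
    idx_b = (idx_a + rng.integers(1, 4, size=(h2, w2))) % 4

    rows, cols = np.meshgrid(np.arange(h2), np.arange(w2), indexing="ij")
    return cells[rows, cols, idx_a], cells[rows, cols, idx_b]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noisy = rng.normal(size=(64, 64, 3))
    sub_a, sub_b = sample_downsampled_pair(noisy, rng)
    print(sub_a.shape, sub_b.shape)
```

In a setup like this, one sub-image would typically serve as the network input and the other as its training target. Both sub-images have half the original resolution, which is the kind of train/test scale mismatch that the scale replay mechanism described in the abstract is meant to address.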

Problem

Research questions and friction points this paper is trying to address.

Single-image denoising
Structural detail preservation
Self-supervised learning framework

Innovation

Methods, ideas, or system contributions that make the work stand out.

Prompt-learning-based denoising framework
Latent diffusion structural representation
Transformer-based structural attention module