🤖 AI Summary
Self-supervised denoising of real-world images faces a fundamental trade-off between noise decorrelation and high-frequency detail preservation. Existing blind-spot networks (BSNs) rely on pixel-shuffle downsampling (PD), but aggressive downsampling degrades structural integrity, while mild downsampling fails to fully decorrelate noise. We propose a cross-scale prediction paradigm that decouples noise decorrelation from detail preservation: low-resolution, fully noise-decorrelated sub-images serve as input for predicting high-resolution, structurally intact clean images. Our method replaces PD with controllable-scale mapping for effective noise decorrelation and constructs cross-scale training pairs for BSN training. The framework naturally supports super-resolution of noisy images without additional training or architectural modification. Evaluated on real-world benchmarks, it achieves state-of-the-art self-supervised denoising performance, significantly alleviating the long-standing noise-detail trade-off.
📝 Abstract
Self-supervised real-world image denoising remains a fundamental challenge, arising from the antagonistic trade-off between decorrelating spatially structured noise and preserving high-frequency details. Existing blind-spot network (BSN) methods rely on pixel-shuffle downsampling (PD) to decorrelate noise, but aggressive downsampling fragments fine structures, while milder downsampling fails to remove correlated noise. To address this, we introduce Next-Scale Prediction (NSP), a novel self-supervised paradigm that decouples noise decorrelation from detail preservation. NSP constructs cross-scale training pairs, in which a BSN takes low-resolution, fully decorrelated sub-images as input to predict high-resolution targets that retain fine details. As a by-product, NSP naturally supports super-resolution of noisy images without retraining or modification. Extensive experiments demonstrate that NSP achieves state-of-the-art self-supervised denoising performance on real-world benchmarks, significantly alleviating the long-standing conflict between noise decorrelation and detail preservation.
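To make the trade-off concrete, the following is a minimal NumPy sketch of pixel-shuffle downsampling as it is commonly defined: an image is split into stride² sub-images by taking every stride-th pixel. The function name `pd_downsample` and the single-channel input are assumptions made for illustration. This is not the authors' implementation, and it does not reproduce NSP's controllable-scale mapping or its cross-scale pair construction, which the abstract does not specify.

```python
import numpy as np


def pd_downsample(img: np.ndarray, stride: int) -> np.ndarray:
    """Split an (H, W) image into stride**2 sub-images.

    Sub-image number i * stride + j equals img[i::stride, j::stride],
    so the result has shape (stride**2, H // stride, W // stride).
    Illustrative helper; the name and signature are hypothetical.
    """
    h, w = img.shape
    if h % stride or w % stride:
        raise ValueError("height and width must be divisible by stride")
    # Group pixels by their offset (i, j) inside each stride x stride cell.
    cells = img.reshape(h // stride, stride, w // stride, stride)
    return cells.transpose(1, 3, 0, 2).reshape(stride * stride, h // stride, w // stride)


# Toy 6x6 "image" whose pixel values are their own flat indices.
img = np.arange(36).reshape(6, 6)
mild = pd_downsample(img, 2)        # 4 sub-images of size 3x3
aggressive = pd_downsample(img, 3)  # 9 sub-images of size 2x2
```

Adjacent pixels in a sub-image are `stride` pixels apart in the original image. A larger stride therefore separates spatially correlated noise more thoroughly, but it also samples the scene more coarsely, which is the conflict between decorrelation and fine structure that the abstract describes.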