TM-BSN: Triangular-Masked Blind-Spot Network for Real-World Self-Supervised Image Denoising

📅 2026-04-06
📈 Citations: 0 (influential: 0)
🤖 AI Summary
This work addresses the limitation of existing blind-spot networks in real-world sRGB image denoising, which struggle to model spatially correlated noise introduced by camera ISP pipelines—particularly demosaicing—due to their assumption of noise independence. To overcome this, the authors propose a novel blind-spot network that introduces triangular masked convolution for the first time, enabling the construction of a diamond-shaped blind region aligned with the structural characteristics of demosaicing-induced noise. This design allows self-supervised denoising at full resolution without requiring downsampling or post-processing. Furthermore, multi-prediction knowledge distillation is integrated to enhance the performance of a lightweight U-Net architecture. The method achieves state-of-the-art self-supervised results on multiple real-image denoising benchmarks, significantly outperforming current approaches.
📝 Abstract
Blind-spot networks (BSNs) enable self-supervised image denoising by preventing access to the target pixel, allowing clean signal estimation without ground-truth supervision. However, this approach assumes pixel-wise noise independence, which is violated in real-world sRGB images due to spatially correlated noise from the camera's image signal processing (ISP) pipeline. While several methods employ downsampling to decorrelate noise, they alter noise statistics and limit the network's ability to utilize full contextual information. In this paper, we propose the Triangular-Masked Blind-Spot Network (TM-BSN), a novel blind-spot architecture that accurately models the spatial correlation of real sRGB noise. This correlation originates from demosaicing, where each pixel is reconstructed from neighboring samples with spatially decaying weights, resulting in a diamond-shaped pattern. To align the receptive field with this geometry, we introduce a triangular-masked convolution that restricts the kernel to its upper-triangular region, creating a diamond-shaped blind spot at the original resolution. This design excludes correlated pixels while fully leveraging uncorrelated context, eliminating the need for downsampling or post-processing. Furthermore, we use knowledge distillation to transfer complementary knowledge from multiple blind-spot predictions into a lightweight U-Net, improving both accuracy and efficiency. Extensive experiments on real-world benchmarks demonstrate that our method achieves state-of-the-art performance, significantly outperforming existing self-supervised approaches. Our code is available at https://github.com/parkjun210/TM-BSN.
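The link between triangular kernel masks and a diamond-shaped blind spot can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that). It is a geometric illustration under stated assumptions: each of four orientations is assumed to see only the region on one side of a diagonal, shifted by a hypothetical offset `s`, and the four regions are combined by union. The function names `triangular_kernel_mask`, `visible_region`, and `diamond` are made up for this example.

```python
import numpy as np


def triangular_kernel_mask(k, offset=0):
    """Binary mask that keeps only the upper-triangular part of a k x k kernel."""
    return np.triu(np.ones((k, k), dtype=np.uint8), k=offset)


def visible_region(size, s):
    """Union of four triangular regions on a size x size grid (size must be odd).

    Assumption for illustration: each orientation sees one side of a diagonal,
    pushed s steps away from the centre pixel.
    """
    ones = np.ones((size, size), dtype=bool)
    upper = np.triu(ones, k=s)    # keeps dx - dy >= s
    lower = np.tril(ones, k=-s)   # keeps dy - dx >= s
    # Mirroring left-right gives the two anti-diagonal orientations.
    return upper | lower | np.fliplr(upper) | np.fliplr(lower)


def diamond(size, s):
    """Pixels whose Manhattan distance to the centre is below s."""
    c = size // 2
    yy, xx = np.mgrid[:size, :size]
    return (np.abs(yy - c) + np.abs(xx - c)) < s


size, s = 7, 2
blind = ~visible_region(size, s)  # pixels that no orientation can see
assert np.array_equal(blind, diamond(size, s))
print(triangular_kernel_mask(3))
print(blind.astype(int))
```

The assertion holds because max(|dx - dy|, |dx + dy|) = |dx| + |dy|: a pixel is hidden from all four orientations exactly when its Manhattan distance to the centre is below `s`, which is a diamond. How the paper actually sets the blind-spot size and combines the orientations is not specified in the abstract.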
Problem

Research questions and friction points this paper is trying to address.

blind-spot network
image denoising
spatially correlated noise
self-supervised learning
sRGB images
Innovation

Methods, ideas, or system contributions that make the work stand out.

blind-spot network
triangular-masked convolution
spatially correlated noise
self-supervised denoising
knowledge distillation
Authors
Junyoung Park
Department of ECE, INMC, Seoul National University, South Korea
Youngjin Oh
Department of ECE, INMC, Seoul National University, South Korea
Nam Ik Cho
Seoul National University, Dept. of Electrical and Computer Engineering
Image Processing, Signal Processing, Adaptive Filtering, Computer Vision