🤖 AI Summary
This work addresses a key limitation of blind-spot networks in real-world sRGB image denoising: because they assume pixel-wise noise independence, they struggle to model the spatially correlated noise introduced by camera ISP pipelines, particularly by demosaicing. To overcome this, the authors propose a novel blind-spot network that, for the first time, uses a triangular masked convolution to construct a diamond-shaped blind region aligned with the structure of demosaicing-induced noise correlation. This design enables self-supervised denoising at full resolution, without downsampling or post-processing. In addition, multi-prediction knowledge distillation transfers the blind-spot network's knowledge to a lightweight U-Net, improving both accuracy and efficiency. The method achieves state-of-the-art self-supervised results on multiple real-image denoising benchmarks, significantly outperforming current approaches.
📝 Abstract
Blind-spot networks (BSNs) enable self-supervised image denoising by preventing access to the target pixel, allowing clean signal estimation without ground-truth supervision. However, this approach assumes pixel-wise noise independence, which is violated in real-world sRGB images due to spatially correlated noise from the camera's image signal processing (ISP) pipeline. While several methods employ downsampling to decorrelate noise, they alter noise statistics and limit the network's ability to utilize full contextual information. In this paper, we propose the Triangular-Masked Blind-Spot Network (TM-BSN), a novel blind-spot architecture that accurately models the spatial correlation of real sRGB noise. This correlation originates from demosaicing, where each pixel is reconstructed from neighboring samples with spatially decaying weights, resulting in a diamond-shaped pattern. To align the receptive field with this geometry, we introduce a triangular-masked convolution that restricts the kernel to its upper-triangular region, creating a diamond-shaped blind spot at the original resolution. This design excludes correlated pixels while fully leveraging uncorrelated context, eliminating the need for downsampling or post-processing. Furthermore, we use knowledge distillation to transfer complementary knowledge from multiple blind-spot predictions into a lightweight U-Net, improving both accuracy and efficiency. Extensive experiments on real-world benchmarks demonstrate that our method achieves state-of-the-art performance, significantly outperforming existing self-supervised approaches. Our code is available at https://github.com/parkjun210/TM-BSN.
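To make the core idea concrete, here is a minimal NumPy sketch of a convolution whose kernel is restricted to its strict upper-triangular region. This is an illustration of the masking principle only, not the paper's implementation: TM-BSN combines such masked convolutions (e.g., with rotated variants inside the network) to carve out the full diamond-shaped blind spot, which this single-kernel sketch does not reproduce. The function name and kernel size are illustrative choices.

```python
import numpy as np

def triangular_masked_conv2d(image, kernel):
    """2D convolution with the kernel restricted to its strict
    upper-triangular region (entries strictly above the main diagonal).
    Illustrative sketch; the paper's exact mask construction may differ."""
    k = kernel.shape[0]
    # Keep only entries with column index > row index; the center tap
    # lies on the diagonal, so it is zeroed out (the "blind" pixel).
    mask = np.triu(np.ones((k, k)), k=1)
    masked_kernel = kernel * mask
    pad = k // 2
    padded = np.pad(image, pad, mode="reflect")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for y in range(h):
        for x in range(w):
            out[y, x] = np.sum(padded[y:y + k, x:x + k] * masked_kernel)
    return out

# A delta image shows the blind spot: the output at the delta's own
# location receives no contribution from it.
img = np.zeros((7, 7))
img[3, 3] = 1.0
out = triangular_masked_conv2d(img, np.ones((5, 5)))
print(out[3, 3])  # 0.0: the center tap is masked out
```

Note that a single masked branch sees only one triangular half-plane of context; aggregating predictions from complementary masked branches is what lets the network use all uncorrelated neighbors while still excluding the correlated diamond region.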