Where is the Watermark? Interpretable Watermark Detection at the Block Level

📅 2025-12-16

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing image watermark detection methods are predominantly black-box: they produce only a global confidence score, with no localization capability or interpretability, which hinders trustworthy provenance tracing. Method: The paper proposes a post-hoc, block-level interpretable watermark detection framework. It introduces, for the first time, a statistically driven block-level embedding strategy in the discrete wavelet transform (DWT) domain, coupled with a differentiable detection mapping that generates pixel-level saliency heatmaps. Contribution/Results: The method jointly optimizes robustness and imperceptibility, keeping the watermark invisible while remaining resilient to common attacks including cropping (≤50%), filtering, and compression. It also localizes semantically manipulated regions. By combining global reliability with local verifiability, it addresses the limitations of conventional black-box detectors and offers a new approach to trustworthy governance of generative AI content.

📝 Abstract

Recent advances in generative AI have enabled the creation of highly realistic digital content, raising concerns around authenticity, ownership, and misuse. While watermarking has become an increasingly important mechanism to trace and protect digital media, most existing image watermarking schemes operate as black boxes, producing global detection scores without offering any insight into how or where the watermark is present. This lack of transparency impacts user trust and makes it difficult to interpret the impact of tampering. In this paper, we present a post-hoc image watermarking method that combines localised embedding with region-level interpretability. Our approach embeds watermark signals in the discrete wavelet transform domain using a statistical block-wise strategy. This allows us to generate detection maps that reveal which regions of an image are likely watermarked or altered. We show that our method achieves strong robustness against common image transformations while remaining sensitive to semantic manipulations. At the same time, the watermark remains highly imperceptible. Compared to prior post-hoc methods, our approach offers more interpretable detection while retaining competitive robustness. For example, our watermarks are robust to cropping up to half the image.
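The abstract does not give the embedding rule, so the following is only a minimal sketch of the general idea (block-wise embedding in a DWT subband, then a per-block detection map), not the authors' method. Everything here is an assumption made for illustration: the one-level Haar transform, the keyed ±1 spread-spectrum pattern, the choice of the diagonal detail subband, the strength `alpha`, the block size, and the threshold of 5.0. All function names are hypothetical.

```python
import math
import random

def haar2d(img):
    """One-level orthonormal 2D Haar DWT of a list-of-rows image (even height/width)."""
    h, w = len(img), len(img[0])
    ll, lh, hl, hh = ([[0.0] * (w // 2) for _ in range(h // 2)] for _ in range(4))
    for i in range(h // 2):
        for j in range(w // 2):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2  # approximation
            lh[i][j] = (a - b + c - d) / 2  # detail: left/right difference
            hl[i][j] = (a + b - c - d) / 2  # detail: top/bottom difference
            hh[i][j] = (a - b - c + d) / 2  # diagonal detail
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d."""
    h, w = len(ll), len(ll[0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            s, x, y, z = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2
    return img

def key_pattern(key, h, w):
    """Keyed pseudo-random +/-1 pattern (an assumed watermark signal)."""
    rng = random.Random(key)
    return [[rng.choice((-1.0, 1.0)) for _ in range(w)] for _ in range(h)]

def embed(img, key, alpha=4.0):
    """Add alpha * pattern to the diagonal detail subband, then invert the DWT.
    Pixels stay floats here; a real pipeline would also round and clip."""
    ll, lh, hl, hh = haar2d(img)
    pat = key_pattern(key, len(hh), len(hh[0]))
    hh = [[c + alpha * p for c, p in zip(cr, pr)] for cr, pr in zip(hh, pat)]
    return ihaar2d(ll, lh, hl, hh)

def detection_map(img, key, block=8):
    """Per-block normalized correlation between the subband and the keyed pattern.
    Assumes the subband height and width are multiples of `block`."""
    _, _, _, hh = haar2d(img)
    pat = key_pattern(key, len(hh), len(hh[0]))
    scores = []
    for bi in range(0, len(hh), block):
        row = []
        for bj in range(0, len(hh[0]), block):
            s = q = 0.0
            for i in range(bi, bi + block):
                for j in range(bj, bj + block):
                    s += hh[i][j] * pat[i][j]
                    q += hh[i][j] ** 2
            row.append(s / math.sqrt(q) if q > 0 else 0.0)
        scores.append(row)
    return scores

# Synthetic host: a smooth gradient plus mild noise (stands in for a real image).
rng = random.Random(0)
host = [[x + y + rng.gauss(0, 2) for x in range(64)] for y in range(64)]
marked = embed(host, key=1234)

# Simulate a local edit: overwrite the top-left 16x16 pixels with unmarked content.
tampered = [row[:] for row in marked]
for y in range(16):
    tampered[y][:16] = host[y][:16]

# Block-level map: True where the watermark is detected (threshold is an assumption).
flags = [[s > 5.0 for s in row] for row in detection_map(tampered, key=1234)]
```

In this toy setup the score of an unmarked block behaves roughly like a standard normal variable, while a marked block scores well above the threshold, so `flags` acts as a coarse map of where the watermark survives. Because the one-level Haar transform is local to 2×2 pixel groups, an edit aligned to a block boundary only affects the scores of the blocks it covers.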

Problem

Research questions and friction points this paper is trying to address.

Develops a block-level interpretable watermark detection method

Addresses lack of transparency in existing image watermarking schemes

Enables localization of watermarks and tampered regions in images

Innovation

Methods, ideas, or system contributions that make the work stand out.

Localized embedding in wavelet transform domain

Block-wise statistical strategy for watermark embedding and detection

Generates interpretable region-level detection maps
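One way a region-level detection map could be read both locally and globally is sketched below. This is an illustrative assumption, not the paper's procedure: it treats each block score as an approximate z-score, combines them with Stouffer's method into a single global statistic, and lists the blocks that fall below a per-block threshold as candidates for tampering. The function name, the example scores, and the threshold are hypothetical.

```python
import math

def summarize_map(scores, block_threshold=5.0):
    """Combine per-block z-like scores into a global statistic and list suspect blocks.

    scores: list of rows of per-block detection scores (assumed roughly N(0, 1)
    when no watermark is present in the block).
    Returns (global_z, suspects), where suspects holds (row, col) block indices.
    """
    flat = [s for row in scores for s in row]
    # Stouffer's method: the sum of k independent N(0, 1) scores, divided by
    # sqrt(k), is again N(0, 1) under the no-watermark hypothesis.
    global_z = sum(flat) / math.sqrt(len(flat))
    suspects = [
        (i, j)
        for i, row in enumerate(scores)
        for j, s in enumerate(row)
        if s < block_threshold
    ]
    return global_z, suspects

# Hypothetical 2x2 block map: three blocks carry the watermark, one does not.
example_scores = [[7.1, 6.9], [0.3, 7.0]]
global_z, suspects = summarize_map(example_scores)
```

With this reading, a high `global_z` supports the claim that the image as a whole is watermarked, while `suspects` points to the specific blocks where the watermark is missing or damaged.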

🔎 Similar Papers

Is The Watermarking Of LLM-Generated Code Robust?