Imperceptible but Forgeable: Practical Invisible Watermark Forgery via Diffusion Models

📅 2025-03-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work exposes a security vulnerability in current invisible watermarking schemes for generative AI: their robustness against forgery attacks is understudied, and they can be forged even in a no-box setting. The paper proposes DiffForge, the first watermark forgery framework capable of forging imperceptible watermarks under that setting. Its core components are: (1) watermark distribution estimation with an unconditional diffusion model; and (2) shallow inversion, which injects the estimated watermark into a non-watermarked image and adaptively selects the depth of inversion steps, based on the observation that watermarks degrade with added noise during the early diffusion phases. Experiments show that DiffForge deceives open-source watermark detectors with a 96.38% success rate and misleads a commercial watermark system with a success rate above 97%, while preserving image quality. These results reveal fundamental security limitations in current watermarking paradigms.

📝 Abstract

Invisible watermarking is critical for content provenance and accountability in Generative AI. Although commercial companies have increasingly committed to using watermarks, the robustness of existing watermarking schemes against forgery attacks is understudied. This paper proposes DiffForge, the first watermark forgery framework capable of forging imperceptible watermarks under a no-box setting. We estimate the watermark distribution using an unconditional diffusion model and introduce shallow inversion to inject the watermark into a non-watermarked image seamlessly. This approach facilitates watermark injection while preserving image quality by adaptively selecting the depth of inversion steps, leveraging our key insight that watermarks degrade with added noise during the early diffusion phases. Comprehensive evaluations show that DiffForge deceives open-source watermark detectors with a 96.38% success rate and misleads a commercial watermark system with over 97% success rate, achieving high confidence. This work reveals fundamental security limitations in current watermarking paradigms.
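
To make the idea of shallow inversion more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not specify the inversion rule, the noise schedule, or the model interface, so everything here is an assumption: the sketch uses deterministic DDIM-style updates, a linear beta schedule, and a hypothetical noise predictor `eps_model(x, t)` standing in for an unconditional diffusion model. The function names and the `depth` parameter are illustrative only.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for an assumed linear beta schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def ddim_step(x, eps, a_from, a_to):
    """Deterministic DDIM-style move from noise level a_from to a_to."""
    # Estimate the clean image implied by the current sample and noise guess.
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    # Re-compose a sample at the target noise level with the same noise guess.
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps


def shallow_invert_and_regenerate(image, eps_model, alpha_bar, depth):
    """Invert `image` for only `depth` steps, then denoise back to step 0.

    `eps_model` is a hypothetical callable (x, t) -> predicted noise with the
    same shape as x. Using one model for both directions is an assumption.
    """
    if not 0 <= depth <= len(alpha_bar):
        raise ValueError("depth must lie in [0, len(alpha_bar)]")
    # Prepend 1.0 so that index 0 is the clean image (no noise).
    a = np.concatenate(([1.0], alpha_bar))
    x = np.asarray(image, dtype=np.float64)
    # Shallow inversion: walk from the clean image up to noise level `depth`.
    for t in range(depth):
        x = ddim_step(x, eps_model(x, t), a[t], a[t + 1])
    # Regeneration: walk back down to the clean level.
    for t in range(depth, 0, -1):
        x = ddim_step(x, eps_model(x, t), a[t], a[t - 1])
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.uniform(-1.0, 1.0, size=(8, 8, 3))
    # Placeholder predictor that always guesses zero noise.
    zero_model = lambda x, t: np.zeros_like(x)
    out = shallow_invert_and_regenerate(clean, zero_model, make_alpha_bar(), 50)
    print(np.allclose(out, clean))
```

With the zero-noise placeholder the round trip returns the input unchanged, which serves as a sanity check on the update rule. How a trained predictor would shift the output toward the watermark distribution, and how the depth is chosen adaptively, are not reproduced here.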

Problem

Research questions and friction points this paper is trying to address.

Forging imperceptible watermarks in Generative AI

Evaluating robustness of watermarking against no-box attacks

Exposing security flaws in commercial watermark systems

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses unconditional diffusion model for watermark estimation

Introduces shallow inversion for seamless watermark injection

Adaptively selects inversion depth to preserve image quality

🔎 Similar Papers

DiffuseTrace: A Transparent and Flexible Watermarking Scheme for Latent Diffusion Model