SemBind: Binding Diffusion Watermarks to Semantics Against Black-Box Forgery Attacks

📅 2026-01-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the vulnerability of existing latent-based watermarks to black-box forgery attacks, in which an adversary embeds a provider's legitimate watermark into images the provider did not generate, compromising provenance. To counter this, the authors propose SemBind, the first defense framework designed specifically to resist black-box forgery in latent-based watermarking. The approach introduces a semantic binding mechanism: a semantic masker trained with contrastive learning produces codes that are nearly invariant for the same prompt and approximately orthogonal across different prompts. These codes are reshaped and permuted to modulate the latent representation, so that watermark verification becomes prompt-aware. Evaluated on four mainstream latent-based watermarking schemes, the method substantially reduces false acceptance rates under black-box forgery, offers a tunable trade-off between security and robustness through an adjustable mask ratio, and has negligible impact on image quality.
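The contrastive objective behind "nearly invariant for the same prompt, approximately orthogonal across prompts" can be illustrated with a generic InfoNCE-style loss. This is a minimal sketch, not the paper's training code: the function name `info_nce`, the temperature value, and the use of prompt IDs as labels are all assumptions made for illustration, and the summary does not specify which contrastive loss SemBind actually uses.

```python
import numpy as np

def info_nce(codes, prompt_ids, temperature=0.1):
    """Generic InfoNCE-style loss (hypothetical helper, not from the paper).

    codes:      (n, d) array of masker outputs, one row per image.
    prompt_ids: (n,) array; rows with equal IDs come from the same prompt.
    The loss is low when same-prompt codes align and
    different-prompt codes are close to orthogonal.
    """
    codes = np.asarray(codes, dtype=float)
    prompt_ids = np.asarray(prompt_ids)
    # Cosine similarity between every pair of codes, scaled by temperature.
    z = codes / np.linalg.norm(codes, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    n = len(z)
    self_mask = np.eye(n, dtype=bool)
    # An anchor is never compared with itself.
    sim = np.where(self_mask, -np.inf, sim)
    # Row-wise log-softmax, computed stably.
    row_max = sim.max(axis=1, keepdims=True)
    log_prob = sim - row_max - np.log(np.exp(sim - row_max).sum(axis=1, keepdims=True))
    # Positives: other rows that share the anchor's prompt.
    pos = (prompt_ids[:, None] == prompt_ids[None, :]) & ~self_mask
    per_anchor = -np.where(pos, log_prob, 0.0).sum(axis=1) / np.maximum(pos.sum(axis=1), 1)
    # Average only over anchors that have at least one positive.
    return per_anchor[pos.any(axis=1)].mean()

# Two prompts, two images each: codes agree within a prompt and are
# orthogonal across prompts, so the loss should be small.
codes = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
print(info_nce(codes, np.array([0, 0, 1, 1])))
```

Relabeling the same codes so that orthogonal rows count as "same prompt" makes the loss large, which is the pressure that drives a masker toward prompt-specific codes.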

📝 Abstract

Latent-based watermarks, integrated into the generation process of latent diffusion models (LDMs), simplify detection and attribution of generated images. However, recent black-box forgery attacks, where an attacker needs at least one watermarked image and black-box access to the provider's model, can embed the provider's watermark into images not produced by the provider, posing outsized risk to provenance and trust. We propose SemBind, the first defense framework for latent-based watermarks that resists black-box forgery by binding latent signals to image semantics via a learned semantic masker. Trained with contrastive learning, the masker yields near-invariant codes for the same prompt and near-orthogonal codes across prompts; these codes are reshaped and permuted to modulate the target latent before any standard latent-based watermark. SemBind is generally compatible with existing latent-based watermarking schemes and keeps image quality essentially unchanged, while a simple mask-ratio parameter offers a tunable trade-off between anti-forgery strength and robustness. Across four mainstream latent-based watermark methods, our SemBind-enabled anti-forgery variants markedly reduce false acceptance under black-box forgery while providing a controllable robustness-security balance.
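The "reshape, permute, modulate" step and the mask-ratio knob can be sketched as follows. Everything concrete here is an assumption for illustration: the abstract does not say how codes are reshaped, how the permutation is keyed, or what form the modulation takes. The sketch uses a ±1 code, tiling to the latent size, a seeded permutation, and sign flipping, and the names `modulate`, `keyed_permutation`, `key`, and `mask_ratio` are hypothetical.

```python
import numpy as np

def keyed_permutation(n, key):
    """Permutation of range(n) derived from an integer key (assumed scheme)."""
    return np.random.default_rng(key).permutation(n)

def modulate(latent, code, key, mask_ratio):
    """Sign-flip a keyed subset of latent entries according to a +/-1 code.

    This is an illustrative stand-in for semantic binding, not the paper's
    construction. Applying it twice with the same code, key, and mask_ratio
    restores the original latent, because every multiplier is +1 or -1.
    """
    flat = latent.reshape(-1).copy()
    n = flat.size
    # "Reshape": tile the code so it covers every latent entry.
    reps = -(-n // code.size)  # ceiling division
    tiled = np.tile(code, reps)[:n]
    # "Permute": scatter code bits across latent positions with a keyed order.
    permuted = tiled[keyed_permutation(n, key)]
    # Mask ratio: only this fraction of positions is bound to the code.
    k = int(round(mask_ratio * n))
    bound = keyed_permutation(n, key + 1)[:k]
    multiplier = np.ones(n)
    multiplier[bound] = permuted[bound]
    return (flat * multiplier).reshape(latent.shape)

rng = np.random.default_rng(0)
latent = rng.standard_normal((4, 8, 8))      # toy latent tensor
code = rng.choice([-1.0, 1.0], size=64)      # toy semantic code
bound_latent = modulate(latent, code, key=7, mask_ratio=0.5)
restored = modulate(bound_latent, code, key=7, mask_ratio=0.5)
print(np.allclose(restored, latent))
```

In this toy version, a verifier holding the matching code undoes the modulation exactly, while a mismatched code leaves part of the bound positions flipped. A larger `mask_ratio` binds more of the latent to the semantics, which is one plausible way to read the security-versus-robustness trade-off the abstract describes.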

Problem

Research questions and friction points this paper is trying to address.

black-box forgery

latent-based watermark

image provenance

diffusion models

watermark security

Innovation

Methods, ideas, or system contributions that make the work stand out.

SemBind

latent diffusion models

black-box forgery