Watermarking Diffusion Language Models

📅 2025-09-29

🤖 AI Summary

Existing watermarking schemes for autoregressive language models (ARLMs) compute the watermark from previously generated tokens, so they do not transfer directly to diffusion language models (DLMs), which generate tokens in arbitrary order and may not yet have that context available. This work proposes the first DLM-specific watermark. It applies the watermark in expectation over the context, even when some context tokens are still undetermined, and it promotes tokens that increase watermark strength when they later serve as context for other tokens. The watermark detector is left unchanged. In the reported experiments, the method achieves a >99% true positive rate with minimal impact on generation quality, and its robustness is similar to that of existing ARLM watermarks.

📝 Abstract

We introduce the first watermark tailored for diffusion language models (DLMs), an emergent LLM paradigm able to generate tokens in arbitrary order, in contrast to standard autoregressive language models (ARLMs) which generate tokens sequentially. While there has been much work in ARLM watermarking, a key challenge when attempting to apply these schemes directly to the DLM setting is that they rely on previously generated tokens, which are not always available with DLM generation. In this work we address this challenge by: (i) applying the watermark in expectation over the context even when some context tokens are yet to be determined, and (ii) promoting tokens which increase the watermark strength when used as context for other tokens. This is accomplished while keeping the watermark detector unchanged. Our experimental evaluation demonstrates that the DLM watermark leads to a >99% true positive rate with minimal quality impact and achieves similar robustness to existing ARLM watermarks, enabling for the first time reliable DLM watermarking.
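
As a rough illustration of the kind of detector that could stay unchanged, the sketch below assumes a red/green-list watermark keyed on the preceding token, a common ARLM design. The abstract does not name the underlying scheme, so `green_mask`, `detect_z_score`, `gamma`, and `key` are hypothetical names and parameters, not the authors' implementation.

```python
import math
import numpy as np

def green_mask(context_token: int, vocab_size: int,
               gamma: float = 0.5, key: int = 42) -> np.ndarray:
    """Hypothetical keyed green list: a pseudo-random subset of the
    vocabulary, seeded by the secret key and the context token."""
    rng = np.random.default_rng([key, int(context_token)])
    mask = np.zeros(vocab_size, dtype=bool)
    mask[rng.permutation(vocab_size)[: int(gamma * vocab_size)]] = True
    return mask

def detect_z_score(tokens: list[int], vocab_size: int,
                   gamma: float = 0.5, key: int = 42) -> float:
    """Count tokens that fall in the green list of their predecessor and
    compare against the gamma * n hits expected for unwatermarked text."""
    if len(tokens) < 2:
        raise ValueError("need at least two tokens to score")
    hits = sum(
        int(green_mask(prev, vocab_size, gamma, key)[cur])
        for prev, cur in zip(tokens, tokens[1:])
    )
    n = len(tokens) - 1
    return (hits - gamma * n) / math.sqrt(n * gamma * (1 - gamma))
```

Detection only needs the final text and the key, so under this assumption it does not matter in which order the generator filled in the tokens.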

Problem

Research questions and friction points this paper is trying to address.

Developing the first watermark for diffusion language models with non-sequential generation

Addressing the challenge of watermarking without relying on previously generated tokens

Enabling reliable detection while maintaining text quality and robustness

Innovation

Methods, ideas, or system contributions that make the work stand out.

Watermark designed for non-autoregressive language models

Applies watermark in expectation over incomplete context tokens

Promotes tokens increasing watermark strength as context
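
The idea of applying a watermark in expectation over incomplete context can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: it assumes a red/green-list watermark keyed on one neighbouring context token, and that the model supplies a probability distribution over that context token when it is still undetermined. The names `green_mask`, `expected_green`, `watermark_logits`, and the parameters `gamma`, `delta`, and `key` are hypothetical.

```python
import numpy as np

def green_mask(context_token: int, vocab_size: int,
               gamma: float = 0.5, key: int = 42) -> np.ndarray:
    """Hypothetical keyed green list seeded by the key and context token."""
    rng = np.random.default_rng([key, int(context_token)])
    mask = np.zeros(vocab_size, dtype=bool)
    mask[rng.permutation(vocab_size)[: int(gamma * vocab_size)]] = True
    return mask

def expected_green(context_probs: np.ndarray,
                   gamma: float = 0.5, key: int = 42) -> np.ndarray:
    """For each candidate token, the probability that it is green,
    averaged over the (possibly uncertain) context token.
    A one-hot context_probs recovers the usual deterministic green list."""
    vocab_size = context_probs.shape[0]
    expected = np.zeros(vocab_size)
    for c in np.flatnonzero(context_probs):
        expected += context_probs[c] * green_mask(c, vocab_size, gamma, key)
    return expected

def watermark_logits(logits: np.ndarray, context_probs: np.ndarray,
                     delta: float = 2.0, gamma: float = 0.5,
                     key: int = 42) -> np.ndarray:
    """Boost each candidate in proportion to its expected greenness."""
    return logits + delta * expected_green(context_probs, gamma, key)
```

When the context token is already fixed, the boost reduces to the standard `delta` on green tokens; when it is uncertain, tokens that are green under many likely contexts receive the largest boost. The second idea in the list, promoting tokens that strengthen the watermark when used as context, is not covered by this sketch.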

🔎 Similar Papers

From Intentions to Techniques: A Comprehensive Taxonomy and Challenges in Text Watermarking for Large Language Models