🤖 AI Summary
Existing watermarking schemes were designed for autoregressive language models (ARLMs) and rely on previously generated tokens as context, so they do not transfer directly to diffusion language models (DLMs), which can generate tokens in arbitrary order. This work proposes the first watermark tailored to DLMs. It addresses the missing-context problem in two ways: (i) the watermark is applied in expectation over the context, even when some context tokens are not yet determined, and (ii) tokens that increase the watermark strength when they later serve as context for other tokens are promoted. The watermark detector is left unchanged. In the authors' evaluation, the DLM watermark achieves a >99% true positive rate with minimal impact on generation quality, and its robustness is similar to that of existing ARLM watermarks, which the authors present as the first reliable watermarking for DLMs.
📝 Abstract
We introduce the first watermark tailored for diffusion language models (DLMs), an emergent LLM paradigm able to generate tokens in arbitrary order, in contrast to standard autoregressive language models (ARLMs), which generate tokens sequentially. While there has been much work on ARLM watermarking, a key challenge when attempting to apply these schemes directly to the DLM setting is that they rely on previously generated tokens, which are not always available during DLM generation. In this work we address this challenge by: (i) applying the watermark in expectation over the context even when some context tokens are yet to be determined, and (ii) promoting tokens which increase the watermark strength when used as context for other tokens. This is accomplished while keeping the watermark detector unchanged. Our experimental evaluation demonstrates that the DLM watermark leads to a >99% true positive rate with minimal quality impact and achieves similar robustness to existing ARLM watermarks, enabling for the first time reliable DLM watermarking.
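To make idea (i) concrete, here is a minimal sketch of what "applying the watermark in expectation over the context" could look like. It is not the authors' implementation. It assumes a red-green-list style watermark in which the previous token seeds a pseudo-random "green" subset of the vocabulary, a choice the abstract does not specify. All names (`green_set`, `expected_green_prob`, `watermark_logits`) and parameters (`gamma`, `delta`, `key`) are hypothetical. When the context token is still masked, the sketch averages each candidate token's greenness over the model's predicted distribution for that context position instead of relying on a single known token. Idea (ii), promoting tokens that make good context, is not covered here.

```python
import random

def green_set(context_token, vocab_size, gamma=0.5, key=42):
    """Pseudo-random 'green' subset of the vocabulary, seeded by the context token.
    Assumption: a red-green-list scheme; the abstract does not name one."""
    rng = random.Random(key * 1_000_003 + context_token)
    return set(rng.sample(range(vocab_size), int(gamma * vocab_size)))

def expected_green_prob(ctx_probs, vocab_size, gamma=0.5, key=42):
    """For each candidate token t, the probability that t is green,
    averaged over the (possibly uncertain) context token distribution."""
    expected = [0.0] * vocab_size
    for c, p in enumerate(ctx_probs):
        if p == 0.0:
            continue
        for t in green_set(c, vocab_size, gamma, key):
            expected[t] += p
    return expected

def watermark_logits(logits, ctx_probs, delta=2.0, gamma=0.5, key=42):
    """Boost each logit by delta times its expected greenness."""
    exp_green = expected_green_prob(ctx_probs, len(logits), gamma, key)
    return [l + delta * g for l, g in zip(logits, exp_green)]

vocab_size = 8
logits = [0.0] * vocab_size

# Context token already determined (one-hot): reduces to the usual green-list boost.
known = [1.0 if c == 3 else 0.0 for c in range(vocab_size)]
# Context token still masked: the model is split between tokens 3 and 5.
uncertain = [0.5 if c in (3, 5) else 0.0 for c in range(vocab_size)]

print(watermark_logits(logits, known))
print(watermark_logits(logits, uncertain))
```

With a one-hot context distribution the boost is all-or-nothing, as in an autoregressive setting. With an uncertain context, tokens that are green under several plausible context tokens receive the largest boost, which is the intuition behind watermarking in expectation. The detector is untouched in this sketch because it only ever sees the final, fully determined text.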