🤖 AI Summary
This work proposes an efficient watermarking mechanism tailored for diffusion language models (DLMs), which generate text via non-sequential iterative denoising and are therefore poorly served by existing watermarking approaches designed for autoregressive models. The method introduces a bidirectional context-aware embedding strategy that applies a lightweight bias during token generation, keyed on the left and right neighboring tokens. This enables strong statistical traceability and high detectability while incurring negligible runtime and memory overhead. Empirical evaluations show that the watermarked model matches the generation efficiency of the original while keeping the embedded watermark both robust and reliably detectable.
📝 Abstract
Watermarking (WM) is a critical mechanism for detecting and attributing AI-generated content. Current WM methods for Large Language Models (LLMs) are predominantly tailored for autoregressive (AR) models: they rely on tokens being generated sequentially, and embed stable signals within the generated sequence based on the previously sampled text. Diffusion Language Models (DLMs) generate text via non-sequential iterative denoising, so WM methods designed for AR models cannot be applied without significant modification. Recent work proposed watermarking DLMs by inverting the generation process when needed, but this incurs significant computational or memory overhead. We introduce Left-Right Diffusion Watermarking (LR-DWM), a scheme that biases each generated token based on both its left and right neighbors, when they are available. LR-DWM incurs minimal runtime and memory overhead, remaining close to the non-watermarked baseline DLM while enabling reliable statistical detection under standard evaluation settings. Our results demonstrate that DLMs can be watermarked efficiently, achieving high detectability with negligible computational and memory overhead.
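The abstract describes biasing each token on its left and right neighbors and then detecting the watermark statistically. The paper's exact construction is not reproduced here, so the following is only an illustrative sketch of the general idea, in the style of green-list watermarking: a pseudo-random "green" subset of the vocabulary is derived from the (left, right) neighbor pair, green tokens get an additive logit bias at generation time, and detection computes a z-score on the green-token rate. All names and constants (`green_list`, `GAMMA`, `DELTA`, the `-1` sentinel for a missing neighbor) are our own assumptions, not the authors' implementation.

```python
import hashlib
import numpy as np

VOCAB_SIZE = 50
GAMMA = 0.5   # fraction of the vocabulary placed on the "green" list (assumed)
DELTA = 4.0   # additive logit bias for green tokens (assumed)

def green_list(left, right, vocab_size=VOCAB_SIZE, gamma=GAMMA):
    """Derive a pseudo-random green list from the (left, right) neighbor pair.
    A missing neighbor (sequence edge or not-yet-denoised position) is -1."""
    digest = hashlib.sha256(f"{left},{right}".encode()).digest()
    rng = np.random.default_rng(int.from_bytes(digest[:8], "little"))
    perm = rng.permutation(vocab_size)
    return set(perm[: int(gamma * vocab_size)].tolist())

def biased_sample(logits, left, right, rng):
    """Add DELTA to green-list logits, then sample from the softmax."""
    logits = np.asarray(logits, dtype=float).copy()
    for tok in green_list(left, right):
        logits[tok] += DELTA
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

def detect_z(tokens):
    """z-score of the observed green-token count against the GAMMA baseline."""
    hits = 0
    for i, tok in enumerate(tokens):
        left = tokens[i - 1] if i > 0 else -1
        right = tokens[i + 1] if i + 1 < len(tokens) else -1
        hits += tok in green_list(left, right)
    n = len(tokens)
    return (hits - GAMMA * n) / np.sqrt(n * GAMMA * (1 - GAMMA))
```

To mimic a denoising order in which both neighbors can already be available, one can fill odd positions first and then sample the even positions with the bias; the biased sequence then yields a large positive z-score, while unbiased text stays near zero in expectation. This is only meant to show the statistical mechanism, not DLM inference itself.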