🤖 AI Summary
Existing watermarking methods primarily target autoregressive large language models (LLMs) and continuous diffusion image models, leaving discrete diffusion language models (DDLMs) unaddressed.
Method: We propose the first verifiable, distortion-free watermarking mechanism for DDLMs. At each discrete diffusion step, we derive a deterministic random seed from the sequence index and employ distribution-preserving Gumbel-max sampling, ensuring the marginal output distribution remains strictly unchanged. This tightly couples watermark embedding with the diffusion process.
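The seeded Gumbel-max step can be sketched as follows. This is a minimal illustration under assumptions not stated in the summary: the integer `key`, the seed-mixing arithmetic, and the function names are all hypothetical; only the core idea (per-position seeded randomness plus the Gumbel-max trick) comes from the text.

```python
import numpy as np

def seeded_uniforms(key: int, position: int, vocab_size: int) -> np.ndarray:
    # Deterministic randomness keyed by (watermark key, sequence index).
    # The seed-mixing constant is illustrative, not the paper's construction.
    rng = np.random.default_rng((key * 1_000_003 + position) % 2**32)
    return rng.random(vocab_size)

def gumbel_max_sample(logits: np.ndarray, position: int, key: int = 12345) -> int:
    u = seeded_uniforms(key, position, len(logits))
    # Gumbel(0,1) noise via inverse CDF: g = -log(-log(u)).
    g = -np.log(-np.log(u))
    # argmax(logits + g) is an exact draw from softmax(logits), so the
    # marginal output distribution is unchanged (distortion-free).
    return int(np.argmax(logits + g))
```

Because the noise is a deterministic function of the key and the position, a detector holding the key can later reconstruct it, while the marginal distribution of each sampled token still equals softmax of the model's logits.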
Contribution/Results: We provide a theoretical guarantee that the false positive rate decays exponentially with sequence length. Experiments across mainstream DDLMs demonstrate high detection accuracy, strong robustness against common perturbations, and zero impact on generation quality—without requiring model fine-tuning or architectural modification. To our knowledge, this is the first rigorously provable and practically deployable watermarking solution for DDLMs, filling a critical gap in AI-generated text provenance.
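To make the detection side concrete: the detector can re-derive the same per-position randomness from the key and score how strongly the observed tokens align with it. The statistic below is an Aaronson-style exponential score, a common choice for Gumbel-trick watermarks; the paper's exact test statistic, seeding scheme, and threshold are not given in the summary, so treat this as a sketch under those assumptions.

```python
import numpy as np

def detect_watermark(tokens, vocab_size, key=12345, threshold=None):
    # Recompute the seeded uniforms used at each position and accumulate
    # -log(1 - u[token]); watermarked tokens tend to land on large u values.
    score = 0.0
    for t, tok in enumerate(tokens):
        rng = np.random.default_rng((key * 1_000_003 + t) % 2**32)
        u = rng.random(vocab_size)
        score += -np.log(1.0 - u[tok])
    # Under H0 (unwatermarked text) the score is ~ Gamma(n, 1), so a fixed
    # threshold test's false-positive rate decays exponentially in n.
    n = len(tokens)
    if threshold is None:
        threshold = n + 4.0 * np.sqrt(n)  # illustrative threshold choice
    return score, score > threshold
```

For unwatermarked text each summand is an independent Exp(1) variable, which is why Chernoff-style bounds give the exponentially decaying false positive rate in sequence length that the summary refers to.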
📝 Abstract
Watermarking has emerged as a promising technique to track AI-generated content and differentiate it from authentic human creations. While prior work extensively studies watermarking for autoregressive large language models (LLMs) and image diffusion models, no existing method addresses discrete diffusion language models, which are becoming popular due to their high inference throughput. In this paper, we introduce the first watermarking method for discrete diffusion models by applying the distribution-preserving Gumbel-max trick at every diffusion step and seeding the randomness with the sequence index to enable reliable detection. We experimentally demonstrate that our scheme is reliably detectable on state-of-the-art diffusion language models and analytically prove that it is distortion-free, with a probability of false detection that decays exponentially in the token sequence length.