Watermarking Autoregressive Image Generation

📅 2025-06-19

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses provenance attribution for autoregressive image generation models by proposing the first end-to-end framework for embedding and detecting watermarks at the token level. The core difficulty is the lack of reverse cycle-consistency (RCC): re-tokenizing a generated image significantly perturbs the token sequence, which causes conventional token-level watermarks to fail. To overcome this, the authors adapt language-model watermarking principles to the VQ-VAE latent domain and introduce an RCC-enhancing joint fine-tuning procedure for the tokenizer and detokenizer, coupled with a differentiable watermark synchronization layer. Detection uses a statistical significance test with theoretically grounded p-values. Evaluated across multiple autoregressive image models, the method achieves >99% detection accuracy and remains robust to JPEG compression, cropping, scaling, and other geometric transformations, addressing the vulnerability of token-level watermarks to erasure.

📝 Abstract

Watermarking the outputs of generative models has emerged as a promising approach for tracking their provenance. Despite significant interest in autoregressive image generation models and their potential for misuse, no prior work has attempted to watermark their outputs at the token level. In this work, we present the first such approach by adapting language model watermarking techniques to this setting. We identify a key challenge: the lack of reverse cycle-consistency (RCC), wherein re-tokenizing generated image tokens significantly alters the token sequence, effectively erasing the watermark. To address this and to make our method robust to common image transformations, neural compression, and removal attacks, we introduce (i) a custom tokenizer-detokenizer finetuning procedure that improves RCC, and (ii) a complementary watermark synchronization layer. As our experiments demonstrate, our approach enables reliable and robust watermark detection with theoretically grounded p-values.
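
To make "theoretically grounded p-values" concrete, here is a minimal sketch of a green-list detection test in the style of common language-model watermarks. The abstract does not specify the scheme, so the hashing rule, the value of `GAMMA`, the key, and the names `is_green` and `detect` are all assumptions for illustration, not the authors' implementation.

```python
import hashlib
from math import exp, lgamma, log

GAMMA = 0.25  # assumed fraction of the vocabulary that is "green" in each context


def is_green(prev_token: int, token: int, key: str = "demo-key") -> bool:
    """Pseudo-randomly place `token` on the green list, seeded by the previous token."""
    digest = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA


def log_binom_pmf(n: int, i: int, p: float) -> float:
    """Log of the Binomial(n, p) probability mass at i (log-space avoids overflow)."""
    log_choose = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
    return log_choose + i * log(p) + (n - i) * log(1 - p)


def detect(tokens: list[int], key: str = "demo-key") -> tuple[int, int, float]:
    """Return (green_count, scored_count, p_value) for a token sequence."""
    # Score each distinct (previous, current) pair once so repeats do not inflate the count.
    pairs = set(zip(tokens, tokens[1:]))
    n = len(pairs)
    k = sum(is_green(prev, tok, key) for prev, tok in pairs)
    # Null hypothesis (no watermark): each scored pair is green with probability GAMMA,
    # so the p-value is the upper tail P(X >= k) of Binomial(n, GAMMA).
    p_value = min(1.0, sum(exp(log_binom_pmf(n, i, GAMMA)) for i in range(k, n + 1)))
    return k, n, p_value
```

In this toy setting, a small p-value means the sequence contains far more green tokens than chance would explain. The test only works if the detector recovers roughly the same tokens that were generated, which is why re-tokenization stability matters.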

Problem

Research questions and friction points this paper is trying to address.

Watermarking autoregressive image generation models

Addressing token sequence alteration during re-tokenization

Ensuring robustness against image transformations and attacks
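
The re-tokenization problem listed above can be illustrated with a toy measurement of how many tokens survive a decode and re-encode round trip. This sketch assumes a hypothetical 16-entry scalar codebook and models the lossy detokenizer (or a later image transformation) as additive Gaussian noise; real image tokenizers are learned networks, so `CODEBOOK`, `detokenize`, `tokenize`, and `token_match_rate` are illustrative stand-ins only.

```python
import random

# Hypothetical scalar codebook; real tokenizers use learned vector codebooks.
CODEBOOK = [i / 15 for i in range(16)]


def detokenize(tokens: list[int], noise: float, seed: int = 0) -> list[float]:
    """Map tokens to codebook values, with Gaussian noise standing in for decoder loss."""
    rng = random.Random(seed)
    return [CODEBOOK[t] + rng.gauss(0.0, noise) for t in tokens]


def tokenize(values: list[float]) -> list[int]:
    """Map each value to the index of its nearest codebook entry."""
    return [min(range(len(CODEBOOK)), key=lambda i: abs(CODEBOOK[i] - v)) for v in values]


def token_match_rate(tokens: list[int], noise: float) -> float:
    """Fraction of tokens left unchanged by detokenize followed by tokenize."""
    retokenized = tokenize(detokenize(tokens, noise))
    return sum(a == b for a, b in zip(tokens, retokenized)) / len(tokens)
```

With `noise=0.0` the round trip is exact and the match rate is 1.0. As the noise approaches half the spacing between codebook entries, a large share of tokens flip, which is the kind of alteration that would erase a token-level watermark.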

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adapts language model watermarking to autoregressive image generation

Introduces tokenizer-detokenizer finetuning for reverse cycle-consistency

Adds watermark synchronization layer for robustness against attacks
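
For the first item above, one common way language-model watermarks are embedded is by biasing next-token logits toward a keyed green list before sampling. The sketch below shows that idea on generic logits. It assumes a hash-based green list with fraction `GAMMA` and a bias strength `DELTA`; these values and the helper names are hypothetical, and the paper's exact scheme may differ.

```python
import hashlib
import math
import random

GAMMA = 0.25  # assumed green-list fraction
DELTA = 4.0   # assumed logit bias added to green tokens


def is_green(prev_token: int, token: int, key: str = "demo-key") -> bool:
    """Pseudo-randomly place `token` on the green list, seeded by the previous token."""
    digest = hashlib.sha256(f"{key}:{prev_token}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 < GAMMA


def sample_watermarked(logits: list[float], prev_token: int,
                       rng: random.Random, key: str = "demo-key") -> int:
    """Sample the next token after adding DELTA to the logits of green tokens."""
    biased = [
        logit + DELTA if is_green(prev_token, tok, key) else logit
        for tok, logit in enumerate(logits)
    ]
    # Softmax weights, shifted by the maximum for numerical stability.
    peak = max(biased)
    weights = [math.exp(b - peak) for b in biased]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]
```

A usage sketch: starting from `tokens = [0]`, repeatedly append `sample_watermarked(model_logits, tokens[-1], rng)`, where `model_logits` stands for whatever the image model predicts at that step. A larger `DELTA` makes the watermark easier to detect at the cost of shifting the sampling distribution further from the model's own.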

🔎 Similar Papers

LaWa: Using Latent Space for In-Generation Image Watermarking