Compositional Adversarial Training for Robust Visual Watermarking

📅 2026-05-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses a limitation of conventional robust watermarking training: random post-processing augmentations cover the space of real-world composite attacks poorly, which leads to unstable training and low sample efficiency. To overcome this, the authors propose Compositional Adversarial Training (CAT), a plug-and-play framework that formulates watermark robustness as a min-max optimization problem over a space of composite transformations. CAT trains a differentiable sequential adversary that selects, at each step, the attack family that most disrupts message recovery. By combining straight-through Gumbel-Softmax selection with entropy regularization, the method keeps training end-to-end differentiable and aggregates gradients across attack families, which prevents the adversary from collapsing to a single attack mode. Experiments on image and video benchmarks show that CAT consistently outperforms random-augmentation baselines, improving watermark capacity by up to 63.5% under single-step attacks and 13.0% under composite attacks, and raising TPR@FPR=1% by 12% on average against difficult geometric transformations in the autoregressive setting.

📝 Abstract

Robust watermarking is typically trained with random post-processing augmentation, but random sampling under-covers the combinatorial space of realistic attack pipelines and seldom encounters the rare compositions that actually break detection. This leads to unstable training and poor sample efficiency. We instead formulate watermark robustness as a min-max problem over a structured space of compositional transformations. We propose Compositional Adversarial Training (CAT), a plug-in framework that learns a sequential differentiable adversary, which observes the current watermarked image and selects an attack family at each step to maximally disrupt message recovery. CAT combines straight-through Gumbel-Softmax attack selection with entropy regularization, so that the backward pass is end-to-end differentiable and aggregates gradient information across attack families, yielding faster, smoother convergence without collapsing to a single attack mode. We evaluate CAT on the post-generation watermarks VideoSeal 0.0, VideoSeal 1.0, and PixelSeal, and on the in-generation watermark WMAR, under both single-step and two-step attack suites, on in-distribution and multiple out-of-distribution (OOD) image and video benchmarks. CAT consistently outperforms random-augmentation baselines trained with the same augmentation budget, with the largest gains on hard composed attacks and OOD evaluations, improving overall watermark capacity by up to $63.5\%$ in the single-step attack setting and $13.0\%$ in the compositional setting. In the autoregressive setting, CAT improves TPR@FPR$=1\%$ by $12\%$ on average on difficult geometric transformations. These results show that robust visual watermarking benefits from training against adaptive compositional adversaries rather than independent random corruptions.
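
To make the attack-selection step concrete, the following is a minimal, standard-library Python sketch of one straight-through Gumbel-Softmax draw over attack families, together with the entropy term. It is an illustration under stated assumptions, not the authors' implementation: the attack family names, the per-family loss values, the temperature `tau`, and all function names are hypothetical, and the gradient is written out by hand for a surrogate loss of the form sum_k y_k * loss_k, which assumes a decoding loss is available for every attack family.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax_st(logits, tau, rng):
    """One relaxed categorical draw: returns (hard one-hot, soft probabilities)."""
    # Gumbel(0, 1) noise via inverse transform: g = -log(-log(u)), u in (0, 1).
    noise = [-math.log(-math.log(rng.uniform(1e-9, 1.0 - 1e-9))) for _ in logits]
    soft = softmax([(l + g) / tau for l, g in zip(logits, noise)])
    # Forward pass uses the discrete choice (argmax of the soft sample).
    k = max(range(len(soft)), key=soft.__getitem__)
    hard = [1.0 if i == k else 0.0 for i in range(len(soft))]
    return hard, soft


def straight_through_grad(soft, per_attack_loss, tau):
    """Gradient of sum_k y_k * loss_k w.r.t. the logits, taken through the
    soft sample. Every family contributes, not only the selected one."""
    mean_loss = sum(p * l for p, l in zip(soft, per_attack_loss))
    return [p * (l - mean_loss) / tau for p, l in zip(soft, per_attack_loss)]


def entropy(probs):
    """Shannon entropy; rewarding it discourages collapse onto one attack."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


rng = random.Random(0)
attack_families = ["jpeg", "crop", "blur", "rotate"]  # hypothetical names
logits = [0.0, 0.0, 0.0, 0.0]                         # adversary's scores
per_attack_loss = [0.2, 0.9, 0.1, 0.6]                # made-up decoding losses

hard, soft = gumbel_softmax_st(logits, tau=1.0, rng=rng)
grad = straight_through_grad(soft, per_attack_loss, tau=1.0)
selected = attack_families[hard.index(1.0)]
regularizer = entropy(softmax(logits))
```

In this sketch the forward pass applies only the selected attack, while the hand-derived gradient weights each family by its soft probability and by how far its loss sits from the probability-weighted mean, which is one way to read "aggregating gradient information across attack families". In a framework with automatic differentiation the same effect is usually obtained by using the hard sample in the forward pass and the soft sample's gradient in the backward pass.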

Problem

Research questions and friction points this paper is trying to address.

robust watermarking

compositional attacks

adversarial training

sample efficiency

attack pipelines

Innovation

Methods, ideas, or system contributions that make the work stand out.

Compositional Adversarial Training

robust visual watermarking

differentiable adversary