🤖 AI Summary
Robustness and removability of digital watermarks pull in opposite directions: existing removal methods struggle to erase a watermark completely while preserving content fidelity. This paper proposes what it describes as the first diffusion Transformer-based, dual-path token-guided watermark removal framework. It models removal as a conditional generation process jointly driven by visual tokens (encoding textural details) and structural tokens (encoding geometric layout), which bypasses explicit modeling of the watermark noise and targets the fidelity–completeness trade-off at its source. The approach thereby unifies watermark removal with structure-aware reconstruction. It is reported to outperform state-of-the-art methods across multiple benchmarks on PSNR, LPIPS, and human perceptual evaluation, and to maintain geometric consistency and visual realism even under challenging conditions involving complex textures and geometric deformations.
📝 Abstract
In the digital economy era, digital watermarking serves as a critical basis for proving ownership of massively replicable content, including AI-generated and other virtual assets. Designing robust watermarks that withstand various attacks and processing operations is therefore all the more important. We introduce TokenPure, a novel Diffusion Transformer-based framework for effective and consistent watermark removal. TokenPure resolves the trade-off between thorough watermark destruction and content consistency through token-based conditional reconstruction. It reframes the task as conditional generation, entirely bypassing the initial watermark-carrying noise. To do so, it decomposes the watermarked image into two complementary token sets: visual tokens for texture and structural tokens for geometry. These tokens jointly condition the diffusion process, enabling the framework to synthesize watermark-free images with fine-grained consistency and structural integrity. Comprehensive experiments show that TokenPure achieves state-of-the-art watermark removal and reconstruction fidelity, substantially outperforming existing baselines in both perceptual quality and consistency.
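
The dual-path decomposition described above can be illustrated with a toy sketch. The abstract does not say how TokenPure extracts its tokens, so everything here is an assumption made for illustration: raw pixel patches stand in for visual tokens, patches of a gradient-magnitude map stand in for structural tokens, and the function names (`patchify`, `gradient_magnitude`, `build_condition`) are hypothetical. The diffusion Transformer itself is omitted; the sketch only shows how two token sets might be assembled into a single conditioning sequence.

```python
# Toy sketch only: NOT TokenPure's actual tokenizers or architecture.
# Assumption: the image is a 2D grayscale list of rows with float values.

def patchify(img, p):
    """Split an image into non-overlapping p x p patches, each flattened to a list."""
    h, w = len(img), len(img[0])
    tokens = []
    for y in range(0, h - h % p, p):
        for x in range(0, w - w % p, p):
            tokens.append([img[y + dy][x + dx]
                           for dy in range(p) for dx in range(p)])
    return tokens

def gradient_magnitude(img):
    """Crude structure map: forward-difference gradient magnitude per pixel."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

def build_condition(img, p=4):
    """Return one conditioning sequence of (token_type, token) pairs.

    token_type 0 marks a stand-in visual token (texture),
    token_type 1 marks a stand-in structural token (geometry).
    """
    visual = patchify(img, p)
    structural = patchify(gradient_magnitude(img), p)
    return [(0, t) for t in visual] + [(1, t) for t in structural]

if __name__ == "__main__":
    # 8 x 8 horizontal ramp as a placeholder for a watermarked image.
    image = [[float(x) for x in range(8)] for _ in range(8)]
    cond = build_condition(image, p=4)
    print(len(cond), "conditioning tokens of length", len(cond[0][1]))
```

In a real system the conditioning sequence would be fed to the denoiser at every sampling step, and sampling would start from fresh noise rather than from the watermarked image, which is what lets the generated output avoid inheriting the watermark signal.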