🤖 AI Summary
Existing deepfake defensive watermarking schemes have seen insufficient validation of their robustness under black-box settings. We propose DeMark, a query-free black-box attack framework that, for the first time, exposes the vulnerability of encoder-decoder-based watermarking models in the latent space. DeMark leverages a compressive-sensing-driven sparsification strategy to effectively attenuate watermark signals in the latent representation while preserving visual image quality. Extensive experiments demonstrate that DeMark reduces the detection accuracy of eight state-of-the-art watermarking methods from 100% to an average of 32.9%, substantially outperforming existing attacks, all without access to the target model or any query feedback. A hedged illustration of how such a detection-accuracy score is commonly computed appears after this summary.
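For context, a minimal sketch of one common scoring convention for multi-bit watermarks: an image counts as "detected" when enough of its decoded bits match the embedded message. The threshold `tau` and the per-image bit-accuracy criterion here are illustrative assumptions, not the paper's stated protocol.

```python
import numpy as np

def detection_accuracy(true_bits, decoded_bits, tau=0.75):
    """Fraction of images whose watermark is still detected.

    true_bits, decoded_bits: (N, L) arrays of 0/1 bits for N images
    with L-bit messages. An image is 'detected' if its per-image bit
    accuracy reaches the (assumed) threshold tau.
    """
    bit_acc = (true_bits == decoded_bits).mean(axis=1)  # per-image bit accuracy
    return (bit_acc >= tau).mean()                      # fraction detected
```

Under this convention, a successful removal attack drives per-image bit accuracy toward chance (0.5 for random bits), so the detection rate collapses even though the decoder still outputs a message.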
📝 Abstract
The rapid proliferation of realistic deepfakes has raised urgent concerns over their misuse, motivating the embedding of defensive watermarks in synthetic images for reliable detection and provenance tracking. However, this defense paradigm assumes such watermarks are inherently resistant to removal. We challenge this assumption with DeMark, a query-free black-box attack framework that targets defensive image watermarking schemes for deepfakes. DeMark exploits latent-space vulnerabilities in encoder-decoder watermarking models through a compressive-sensing-based sparsification process, suppressing watermark signals while preserving the perceptual and structural realism expected of deepfakes. Across eight state-of-the-art watermarking schemes, DeMark reduces watermark detection accuracy from 100% to 32.9% on average while maintaining natural visual quality, outperforming existing attacks. We further evaluate three defense strategies, namely image super-resolution, sparse watermarking, and adversarial training, and find them largely ineffective. These results demonstrate that current encoder-decoder watermarking schemes remain vulnerable to latent-space manipulations, underscoring the need for more robust watermarking methods to safeguard against deepfakes.
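To make the core idea concrete, here is a minimal sketch of what compressive-sensing-based sparsification of a latent code could look like: take random linear measurements of the (flattened) latent, then recover a sparse approximation via l1-regularized least squares solved with ISTA, so low-energy components, where a watermark signal is assumed to reside, fall below the threshold. The measurement ratio, regularization weight `lam`, and solver are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def soft_threshold(x, t):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def cs_sparsify(z, m_ratio=0.5, lam=0.05, n_iter=200, seed=0):
    """Compressive-sensing-style sparsification of a flattened latent z.

    Takes m = m_ratio * n random Gaussian measurements y = A z, then
    recovers a sparse approximation z_hat by solving
        min_x 0.5 * ||A x - y||^2 + lam * ||x||_1
    with ISTA. The sparse recovery keeps dominant image content while
    suppressing low-energy components (assumed watermark residue).
    """
    rng = np.random.default_rng(seed)
    n = z.size
    m = int(m_ratio * n)
    A = rng.standard_normal((m, n)) / np.sqrt(m)   # random measurement matrix
    y = A @ z                                      # compressed measurements
    step = 1.0 / np.linalg.norm(A, 2) ** 2         # 1/L, L = sigma_max(A)^2
    x = np.zeros(n)
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)                   # gradient of the data term
        x = soft_threshold(x - step * grad, step * lam)
    return x
```

In an end-to-end attack of this flavor, one would encode the watermarked image with a public autoencoder, run `cs_sparsify` on the flattened latent, reshape, and decode; since this is query-free, no access to the target watermarking model or its feedback is needed.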