🤖 AI Summary
This work proposes a watermark removal framework that operates without image regeneration. Existing semantic watermarks are robust to conventional image-space attacks and are difficult to remove without altering semantic content. The method introduces micro-geometric perturbations, imperceptible to the human visual system, that disrupt the phase alignment of embedded watermarks and thereby enable high-fidelity removal. It employs a mask-guided encoder to learn explicit spatial representations and a 2D Gaussian Splatting-based decoder to parameterize geometric perturbations. According to the authors, this design preserves semantic content while improving both watermark removal effectiveness and visual fidelity, and it supports efficient real-time inference.
📝 Abstract
Semantic watermarks exhibit strong robustness against conventional image-space attacks. In this work, we show that such robustness does not survive under micro-geometric perturbations: spatial displacements can remove watermarks by breaking the phase alignment. Motivated by this observation, we introduce MarkCleaner, a watermark removal framework that avoids semantic drift caused by regeneration-based watermark removal. Specifically, MarkCleaner is trained with micro-geometry-perturbed supervision, which encourages the model to separate semantic content from strict spatial alignment and enables robust reconstruction under subtle geometric displacements. The framework adopts a mask-guided encoder that learns explicit spatial representations and a 2D Gaussian Splatting-based decoder that explicitly parameterizes geometric perturbations while preserving semantic content. Extensive experiments demonstrate that MarkCleaner achieves superior performance in both watermark removal effectiveness and visual fidelity, while enabling efficient real-time inference. Our code will be made available upon acceptance.
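The phase-alignment argument can be illustrated with the Fourier shift theorem. The sketch below is a toy example and not the MarkCleaner method: it assumes a white-noise array standing in for an image, a global one-pixel circular shift standing in for a micro-geometric perturbation, and a hypothetical correlation score standing in for a phase-sensitive detector. The function name `shift_phase_demo` and all of its parameters are illustrative only.

```python
import numpy as np


def shift_phase_demo(size=64, shift=1, seed=0):
    """Toy check: a spatial shift keeps Fourier magnitudes but rotates phases."""
    rng = np.random.default_rng(seed)
    image = rng.standard_normal((size, size))  # stand-in for an image (assumption)
    shifted = np.roll(image, shift, axis=1)    # circular shift along the x axis

    F = np.fft.fft2(image)
    G = np.fft.fft2(shifted)

    # Magnitude spectra are identical up to floating-point error.
    magnitude_error = float(np.max(np.abs(np.abs(F) - np.abs(G))))

    # Shift theorem: G[ky, kx] = F[ky, kx] * exp(-2*pi*i * kx * shift / size).
    kx = np.fft.fftfreq(size)  # kx / size, in cycles per sample
    ramp = np.exp(-2j * np.pi * kx * shift)
    theorem_residual = float(np.max(np.abs(G - F * ramp[None, :])))

    # Hypothetical phase-sensitive score: normalized correlation of the spectra.
    score = float(np.real(np.vdot(F, G)) / (np.linalg.norm(F) * np.linalg.norm(G)))

    return {
        "magnitude_error": magnitude_error,
        "theorem_residual": theorem_residual,
        "score": score,
    }


if __name__ == "__main__":
    print("no shift:   ", shift_phase_demo(shift=0))
    print("1-px shift: ", shift_phase_demo(shift=1))
```

With no shift the score is 1. With a one-pixel shift the magnitudes are unchanged, yet the score falls close to zero because every frequency picks up a different phase rotation. Natural images are far more spatially correlated than white noise, and the abstract does not specify the detector or the exact perturbation model, so this only conveys the intuition behind the abstract's claim that small displacements break phase alignment.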