🤖 AI Summary
This work exposes a structural vulnerability of existing post-hoc audio watermarking methods in speech settings: because they manipulate low-level signal features after generation, they are highly susceptible to common transformations such as compression and filtering, which sharply degrade watermark detectability. The authors (1) unify and extend existing transformation-based evaluations into a single framework for speech watermarking, and (2) introduce a black-box adversarial perturbation attack that removes watermarks with no knowledge of the watermarking scheme and negligible perceptual cost. Experiments show the attack preserves high speech naturalness (PESQ > 4.0) while driving an average detection failure rate of 98.7% across mainstream watermarking schemes. These findings expose fundamental limitations of the post-hoc paradigm and provide both a rigorous benchmark and design guidance for future end-to-end learnable watermarking systems.
📝 Abstract
In the audio modality, state-of-the-art watermarking methods leverage deep neural networks to embed human-imperceptible signatures in generated audio. Ideally, these signatures remain detectable with high accuracy even when the watermarked audio is altered via compression, filtering, or other transformations. Existing audio watermarking techniques operate in a post-hoc manner, manipulating "low-level" features of audio recordings after generation (e.g., through the addition of a low-magnitude watermark signal). We show that this post-hoc formulation makes existing audio watermarks vulnerable to transformation-based removal attacks. Focusing on speech audio, we (1) unify and extend existing evaluations of the effect of audio transformations on watermark detectability, and (2) demonstrate that state-of-the-art post-hoc audio watermarks can be removed with no knowledge of the watermarking scheme and minimal degradation in audio quality.
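To make the attack surface concrete, the sketch below applies two generic signal transformations of the kind the abstract names (low-pass filtering and a lossy resampling round-trip, a crude stand-in for codec compression) to a waveform. This is a minimal illustration using SciPy, not the paper's attack or code; the function name, cutoff, and rates are illustrative choices.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, resample_poly

def transform_attack(audio, sr=16000, cutoff_hz=4000, resample_to=8000):
    """Illustrative transformation chain: low-pass filter, then a
    down/up-sampling round trip. Transformations like these can disrupt
    the low-magnitude signal a post-hoc watermark adds.
    (Names and parameters here are assumptions, not the paper's.)"""
    # 4th-order Butterworth low-pass; sosfiltfilt gives zero-phase filtering
    sos = butter(4, cutoff_hz, btype="low", fs=sr, output="sos")
    filtered = sosfiltfilt(sos, audio)
    # Downsample then restore the original rate: a lossy round trip that
    # discards content above the intermediate Nyquist frequency
    down = resample_poly(filtered, resample_to, sr)
    back = resample_poly(down, sr, resample_to)
    return back[: len(audio)]

# Demo on a synthetic 1-second signal with a high-frequency component
sr = 16000
t = np.arange(sr) / sr
x = 0.5 * np.sin(2 * np.pi * 220 * t) + 0.1 * np.sin(2 * np.pi * 6000 * t)
y = transform_attack(x, sr)
```

After the chain, energy above the cutoff is almost entirely removed while the low-frequency (speech-band) content survives, which is why such transformations can erase a watermark carried in fine-grained spectral detail without obviously degrading perceived quality.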