AI Summary
This work addresses the vulnerability of existing neural audio watermarking methods to deep learning-based attacks, particularly their lack of robustness against transformations that preserve linguistic content and speaker identity while altering acoustic characteristics. The study introduces, for the first time, self voice conversion into the watermarking attack paradigm, proposing a generalizable attack framework that uses a deep learning model to map speech into an acoustic space with modified features but unchanged speaker identity and semantics. Experiments show that this approach substantially degrades the extraction accuracy of multiple state-of-the-art watermarking systems, confirming its effectiveness and broad applicability. The findings expose critical security weaknesses in current neural audio watermarking schemes and pose a new challenge for the design of watermarking mechanisms robust to such attacks.
Abstract
Audio watermarking embeds auxiliary information into speech while maintaining speaker identity, linguistic content, and perceptual quality. Although recent advances in neural and digital signal processing-based watermarking methods have improved imperceptibility and embedding capacity, robustness is still primarily assessed against conventional distortions such as compression, additive noise, and resampling. However, the rise of deep learning-based attacks introduces novel and significant threats to watermark security. In this work, we investigate self voice conversion as a universal, content-preserving attack against audio watermarking systems. Self voice conversion remaps a speaker's voice to the same identity while altering acoustic characteristics through a voice conversion model. We demonstrate that this attack severely degrades the reliability of state-of-the-art watermarking approaches and highlight its implications for the security of modern audio watermarking techniques.