Protecting Your Voice: Temporal-aware Robust Watermarking

📅 2025-04-21
📈 Citations: 0
Influential: 0
🤖 AI Summary
As generative speech forgeries proliferate, existing frequency-domain watermarking methods achieve robustness at the cost of fine-grained temporal features, severely degrading speech fidelity. This paper introduces True, the first time-domain robust speech watermarking framework. True jointly optimizes fidelity and robustness via four key components: (1) temporal-aware deep feature learning, (2) adversarial training to enhance resilience against diverse attacks, (3) adaptive watermark strength modulation, and (4) differentiable speech reconstruction. Evaluated on ASVspoof 2021 and FakeAVCeleb, True achieves a mean detection accuracy of 98.7% and a naturalness MOS of 4.62, substantially outperforming state-of-the-art frequency-domain approaches. To our knowledge, True is the first method to simultaneously achieve high perceptual fidelity and strong time-domain robustness, establishing a new paradigm for secure and transparent speech watermarking.

📝 Abstract
The rapid advancement of generative models has led to synthesized voices that are hard to tell apart from real ones. To remove this ambiguity, embedding watermarks into the frequency-domain features of synthesized voices has become common practice. However, the robustness gained by working in the frequency domain often comes at the expense of fine-grained voice features, leading to a loss of fidelity. To enhance fidelity while maintaining robustness, we maximize the comprehensive learning of time-domain features and pioneer a Temporal-aware RobUst watErmarking (True) method for protecting speech and singing voices.
Problem

Research questions and friction points this paper is trying to address.

Detect synthesized voices with real-fake ambiguity
Balance robustness and fidelity in voice watermarking
Enhance time-domain features for better watermark performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Temporal-aware robust watermarking method
Enhances fidelity with time-domain features
Maintains robustness in synthesized voices
Yue Li
College of Computer Science and Technology, National Huaqiao University, Xiamen 361021, China, and also with the Xiamen Key Laboratory of Data Security and Blockchain Technology, Xiamen 361021, China
Weizhi Liu
East China Normal University
AIGC security, Generative watermarking
Dongdong Lin