FreeTalk: A plug-and-play and black-box defense against speech synthesis attacks

📅 2025-08-30
📈 Citations: 0
Influential: 0
🤖 AI Summary
To mitigate voiceprint privacy leakage caused by zero-shot voice cloning attacks, this paper proposes a lightweight, plug-and-play black-box defense operating in the frequency domain. The method generates speaker-level universal frequency-domain perturbations, integrated with data augmentation and noise-smoothing mechanisms, enabling efficient protection for arbitrary-length speech while preserving high intelligibility and functional utility. Unlike existing approaches, our method exhibits strong cross-model transferability, high robustness against diverse synthesis models, and minimal computational overhead. Extensive experiments are conducted across five text-to-speech models, five speaker verification models, one automatic speech recognition (ASR) model, and two benchmark datasets. Results demonstrate significant improvements in privacy protection efficacy without compromising speech quality or practical usability, confirming its viability for real-world deployment.

📝 Abstract
Recently, speech assistants and speaker verification systems have been deployed in many fields, bringing considerable benefit and convenience. However, while we enjoy these speech applications, our speech may be collected by attackers for speech synthesis. For example, an attacker who obtains a piece of a victim's speech can generate inappropriate political statements carrying the characteristics of the victim's voice, seriously damaging the victim's reputation. Moreover, the emergence of zero-shot voice conversion methods has further reduced the cost of speech synthesis attacks, posing greater challenges to user voice security and privacy. Researchers have proposed corresponding privacy-preserving methods, but existing approaches have non-negligible drawbacks: low transferability and robustness, and high computational overhead. These deficiencies seriously limit the deployment of existing methods in practical scenarios. Therefore, in this paper, we propose a lightweight, robust, plug-and-play privacy-preservation method against speech synthesis attacks in a black-box setting. Our method generates a frequency-domain perturbation and adds it to the original speech, achieving privacy protection while preserving high speech quality. We then present a data augmentation strategy and a noise-smoothing mechanism to improve the robustness of the proposed method. In addition, to reduce the user's defense overhead, we propose a novel identity-wise protection mechanism: it generates a universal perturbation for each speaker and supports privacy preservation for speech of any length. Finally, we conduct extensive experiments on 5 speech synthesis models, 5 speaker verification models, 1 speech recognition model, and 2 datasets. The experimental results demonstrate that our method achieves satisfying privacy-preserving performance, high speech quality, and utility.
Problem

Research questions and friction points this paper is trying to address.

Defending against zero-shot speech synthesis attacks
Addressing low transferability and robustness issues
Reducing computational overhead in privacy preservation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Frequency-domain perturbation for privacy protection
Data augmentation and noise smoothing for robustness
Identity-wise universal perturbation for any speech length
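The core idea listed above — a speaker-level perturbation defined per frequency bin, so one fixed vector protects speech of any length — can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name, STFT parameters, and the magnitude-only application are assumptions, and the actual optimization of the perturbation against surrogate synthesis models is omitted.

```python
import numpy as np

def apply_frequency_perturbation(waveform, delta, n_fft=512, hop=128, eps=0.05):
    """Add a fixed per-frequency-bin perturbation `delta` to a waveform's
    spectrogram, then reconstruct the time-domain signal.

    `delta` has shape (n_fft // 2 + 1,): one value per frequency bin, so the
    same vector applies to speech of any length (hypothetical sketch; the
    paper optimizes a speaker-level perturbation, which is not shown here).
    """
    window = np.hanning(n_fft)
    # Frame the signal, zero-padding so every sample is covered.
    n_frames = 1 + max(0, (len(waveform) - n_fft + hop - 1) // hop)
    padded = np.zeros(n_fft + (n_frames - 1) * hop)
    padded[: len(waveform)] = waveform
    frames = np.stack([padded[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    spec = np.fft.rfft(frames, axis=1)            # (n_frames, n_fft//2 + 1)
    # Perturb magnitudes only, keep phase, and bound the perturbation size.
    mags = np.abs(spec) + eps * np.clip(delta, -1.0, 1.0)
    spec = np.maximum(mags, 0.0) * np.exp(1j * np.angle(spec))
    # Overlap-add inverse STFT with squared-window normalization.
    out = np.zeros_like(padded)
    norm = np.zeros_like(padded)
    for i, frame in enumerate(np.fft.irfft(spec, n=n_fft, axis=1)):
        out[i * hop : i * hop + n_fft] += frame * window
        norm[i * hop : i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[: len(waveform)]
```

Because the perturbation lives in the frequency domain rather than the time domain, it is naturally length-agnostic: the same 257-dimensional vector is broadcast across however many STFT frames the input produces.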
Yuwen Pu
School of Big Data & Software Engineering, Chongqing University, Chongqing, 400030, China
Zhou Feng
Zhejiang University
AI Security
Chunyi Zhou
Zhejiang University
Cyberspace Security, Machine Learning Privacy, Federated Learning
Jiahao Chen
College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang, 310027, China
Chunqiang Hu
Professor of Big Data & Software Engineering, Chongqing University.
Data-Driven Security and Privacy, Algorithm Design and Analysis
Haibo Hu
School of Big Data & Software Engineering, Chongqing University, Chongqing, 400030, China
Shouling Ji
Professor, Zhejiang University & Georgia Institute of Technology
Data-driven Security, AI Security, Software Security, Privacy