DUAP: Dual-task Universal Adversarial Perturbations Against Voice Control Systems

📅 2026-01-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a limitation of existing adversarial attacks: they typically target either automatic speech recognition (ASR) or speaker recognition (SR) in isolation, and so fail to disrupt the joint decision process of real-world voice control systems. The authors first use gradient analysis to show that the ASR and SR attack objectives do not inherently conflict, then propose Dual-task Universal Adversarial Perturbations (DUAP). DUAP uses a targeted surrogate loss to disrupt ASR transcription and a Dynamic Normalized Ensemble (DNE) strategy to improve transferability across SR models, while psychoacoustic masking keeps the perturbation imperceptible. Experiments across five ASR and six SR models show that DUAP achieves, for the first time, highly effective simultaneous attacks on both ASR and SR, significantly outperforming single-task baselines in both attack success rate and imperceptibility.

📝 Abstract
Modern Voice Control Systems (VCS) rely on the collaboration of Automatic Speech Recognition (ASR) and Speaker Recognition (SR) for secure interaction. However, prior adversarial attacks typically target these tasks in isolation, overlooking the coupled decision pipeline in real-world scenarios. Consequently, single-task attacks often fail to pose a practical threat. To fill this gap, we first utilize gradient analysis to reveal that ASR and SR exhibit no inherent conflicts. Building on this, we propose Dual-task Universal Adversarial Perturbation (DUAP). Specifically, DUAP employs a targeted surrogate objective to effectively disrupt ASR transcription and introduces a Dynamic Normalized Ensemble (DNE) strategy to enhance transferability across diverse SR models. Furthermore, we incorporate psychoacoustic masking to ensure perturbation imperceptibility. Extensive evaluations across five ASR and six SR models demonstrate that DUAP achieves high simultaneous attack success rates and superior imperceptibility, significantly outperforming existing single-task baselines.
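The abstract does not spell out how the gradient analysis is performed. As a minimal sketch, one common way to check whether two objectives conflict is to compare the directions of their gradients with respect to the shared perturbation: a negative cosine similarity would mean that a step helping one task hurts the other. The function names and toy gradient values below are hypothetical and for illustration only; this is not the authors' procedure.

```python
import math


def cosine_similarity(g_a, g_b):
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(g_a, g_b))
    norm_a = math.sqrt(sum(a * a for a in g_a))
    norm_b = math.sqrt(sum(b * b for b in g_b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero gradient has no direction; treat it as neutral.
        return 0.0
    return dot / (norm_a * norm_b)


def tasks_conflict(g_asr, g_sr):
    """Assumed criterion: the tasks conflict when their gradients point in opposing directions."""
    return cosine_similarity(g_asr, g_sr) < 0.0


if __name__ == "__main__":
    # Toy gradients of a hypothetical ASR loss and SR loss w.r.t. the same perturbation.
    g_asr = [0.5, -0.2, 0.1]
    g_sr = [0.4, -0.1, 0.3]
    print("cosine similarity:", cosine_similarity(g_asr, g_sr))
    print("conflict:", tasks_conflict(g_asr, g_sr))
```

In practice the two gradients would come from backpropagating each task's loss through its model to the same perturbation tensor and flattening the result; a similarity that stays near or above zero over many samples is consistent with the claim that the two tasks can be attacked jointly.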
Problem

Research questions and friction points this paper is trying to address.

Voice Control Systems
Adversarial Attacks
Automatic Speech Recognition
Speaker Recognition
Dual-task
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual-task Adversarial Attack
Universal Adversarial Perturbation
Dynamic Normalized Ensemble
Psychoacoustic Masking
Voice Control System Security
Authors

Suyang Sun
Beijing University of Posts and Telecommunications
Weifei Jin
Beijing University of Posts and Telecommunications
Trustworthy AI, Agent Safety, Adversarial ML, Speech Security
Yuxin Cao
National University of Singapore, Tsinghua University
AI security, WSN
Wei Song
University of New South Wales
Jie Hao
Beijing University of Posts and Telecommunications