Speech Unlearning

📅 2025-06-01
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper introduces, for the first time, the machine unlearning problem in speech processing: efficiently erasing the influence of specific samples (e.g., a single utterance) or classes (e.g., all data from a speaker) from speech models without full retraining, with applications in privacy compliance, removal of noisy or outdated data, and bias mitigation. To address the high-dimensional, temporal nature of speech and its strong speaker dependencies, the authors: (1) formally define the speech unlearning task; (2) establish the first benchmark for speech unlearning, revealing that it is significantly harder than image or text unlearning; and (3) propose a lightweight unlearning method based on influence functions and gradient approximation, coupled with an unlearning-aware training strategy for speech models. Evaluated on keyword spotting and speaker identification, the approach achieves under 3% main-task accuracy degradation and under 5% memorization recovery of unlearned data, substantially outperforming baselines and demonstrating both the feasibility and the unique challenges of speech unlearning.
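The summary mentions a lightweight method based on influence functions and gradient approximation. The paper's exact formulation is not given here, but the general influence-function unlearning step it alludes to can be sketched on a toy logistic-regression model: apply one Newton-style correction that adds back the forget set's gradient contribution, scaled by the inverse Hessian of the full training loss. All names, data, and hyperparameters below are illustrative, not the paper's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lr=0.1, steps=500, lam=1e-2):
    """Fit L2-regularized logistic regression by gradient descent."""
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / n + lam * w
        w -= lr * grad
    return w

def influence_unlearn(w, X, y, forget_idx, lam=1e-2):
    """Approximately remove the forget samples with one influence
    update: w' = w + H^{-1} g_forget, where H is the Hessian of the
    full training loss and g_forget is the forget set's share of
    the training gradient."""
    n = len(y)
    p = sigmoid(X @ w)
    # Hessian of the regularized loss at the trained parameters
    H = (X.T * (p * (1 - p))) @ X / n + lam * np.eye(X.shape[1])
    # Gradient contribution of the samples to be forgotten
    Xf, yf = X[forget_idx], y[forget_idx]
    g = Xf.T @ (sigmoid(Xf @ w) - yf) / n
    return w + np.linalg.solve(H, g)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)
w = train_logreg(X, y)
w_unlearned = influence_unlearn(w, X, y, forget_idx=np.arange(10))
```

For speech models the Hessian is far too large to form explicitly, which is presumably where the paper's gradient-approximation component comes in; this toy version solves the linear system exactly only because the model is tiny.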

📝 Abstract
We introduce machine unlearning for speech tasks, a novel and underexplored research problem that aims to efficiently and effectively remove the influence of specific data from trained speech models without full retraining. This has important applications in privacy preservation, removal of outdated or noisy data, and bias mitigation. While machine unlearning has been studied in computer vision and natural language processing, its application to speech is largely unexplored due to the high-dimensional, sequential, and speaker-dependent nature of speech data. We define two fundamental speech unlearning tasks: sample unlearning, which removes individual data points (e.g., a voice recording), and class unlearning, which removes an entire category (e.g., all data from a speaker), while preserving performance on the remaining data. Experiments on keyword spotting and speaker identification demonstrate that unlearning speech data is significantly more challenging than unlearning image or text data. We conclude with key future directions in this area, including structured training, robust evaluation, feature-level unlearning, broader applications, scalable methods, and adversarial robustness.
Problem

Research questions and friction points this paper is trying to address.

Efficiently remove specific data influence from trained speech models
Address privacy, outdated data, and bias in speech models
Define and tackle sample and class unlearning for speech data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Machine unlearning for speech tasks
Sample and class unlearning techniques
Addresses high-dimensional speech data challenges
Jiali Cheng
UMass Lowell
Trustworthy AI · Language Agents · AI4Science
Hadi Amiri
University of Massachusetts Lowell, USA