🤖 AI Summary
Machine unlearning for speech emotion recognition (SER) faces challenges when privacy regulations mandate deletion of specific samples: existing methods require access to the retained data, making them inapplicable when that data cannot be redistributed.
Method: We propose the first adversarial unlearning method that requires only the to-be-forgotten samples (no retained data) to erase model knowledge. Introducing adversarial-attack principles to SER unlearning, the approach applies gradient-guided fine-tuning to a pre-trained model to achieve selective knowledge erasure.
Contribution/Results: This framework significantly reduces computational overhead and mitigates the risk of secondary privacy leakage. Experiments demonstrate a >92% forgetting success rate (i.e., effective elimination of the model's memory of the forgotten samples) while preserving primary SER performance with <1.5% accuracy degradation. The method thus achieves both rigorous unlearning completeness and practical model utility.
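The summary does not include code, but the forget-set-only idea can be sketched on a toy softmax classifier. Everything below is an illustrative assumption rather than the authors' exact procedure: label flipping stands in for the adversarial-attack signal, and the toy features, learning rates, and step counts are arbitrary. The key constraint from the paper is preserved: the unlearning step sees only the forget set, never the retained data.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def ce_grad(W, X, Y):
    # Gradient of the mean cross-entropy loss w.r.t. W for logits X @ W.
    return X.T @ (softmax(X @ W) - Y) / len(X)

# Toy stand-in for emotion features: two well-separated classes.
X = np.vstack([rng.normal(-2, 1, (50, 4)), rng.normal(2, 1, (50, 4))])
Y = np.zeros((100, 2))
Y[:50, 0] = 1
Y[50:, 1] = 1

# Standard training: gradient descent on the full training set.
W = np.zeros((4, 2))
for _ in range(200):
    W -= 0.5 * ce_grad(W, X, Y)

# Unlearning: fine-tune on ONLY the forget set, descending toward
# adversarially flipped labels so the model's knowledge of these
# samples is overwritten. No retained data is used at any point.
X_forget, Y_forget = X[:5], Y[:5]
Y_flipped = 1.0 - Y_forget          # adversarial (wrong) targets
W_unlearned = W.copy()
for _ in range(50):
    W_unlearned -= 0.1 * ce_grad(W_unlearned, X_forget, Y_flipped)

# Confidence in the true label on the forget set collapses.
conf_before = softmax(X_forget @ W)[:, 0].mean()
conf_after = softmax(X_forget @ W_unlearned)[:, 0].mean()
```

One design note on the sketch: fine-tuning toward flipped labels gives a large gradient even when the trained model is highly confident, whereas naive gradient ascent on the original labels stalls there because the cross-entropy gradient vanishes as confidence approaches one.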
📝 Abstract
Speech emotion recognition aims to identify emotional states from speech signals and has been widely applied in human-computer interaction, education, healthcare, and many other fields. However, since speech data contain rich sensitive information, speakers may require that some of their data be deleted due to privacy concerns. Current machine unlearning approaches largely depend on data beyond the samples to be forgotten. This reliance poses challenges when data redistribution is restricted, and it demands substantial computational resources in the context of big data. We propose a novel adversarial-attack-based approach that fine-tunes a pre-trained speech emotion recognition model using only the data to be forgotten. The experimental results demonstrate that the proposed approach effectively removes knowledge of the forgotten data from the model while preserving high emotion recognition performance on the test set.