Reminiscence Attack on Residuals: Exploiting Approximate Machine Unlearning for Privacy

📅 2025-07-28
📈 Citations: 0
Influential: 0
🤖 AI Summary
Approximate machine unlearning algorithms inherently retain implicit residual information, leading to membership privacy leakage of forgotten data. This paper demonstrates that such residuals are pervasive across existing methods and introduces the Reminiscence Attack (ReA), which achieves up to 1.90× and 1.12× higher accuracy than baseline attacks in class-level and sample-level membership inference, respectively. To address this vulnerability, the authors propose the first residual-elimination-oriented two-stage unlearning framework, integrating fine-grained residual analysis, targeted fine-tuning, and convergence-stability constraints. The framework applies to both classification and generative tasks. At only 2%–12% of the cost of full retraining, it reduces adaptive privacy attack accuracy to near-random levels, effectively balancing efficient unlearning with strong membership privacy guarantees.

📝 Abstract
Machine unlearning enables the removal of specific data from ML models to uphold the right to be forgotten. While approximate unlearning algorithms offer efficient alternatives to full retraining, this work reveals that they fail to adequately protect the privacy of unlearned data. In particular, these algorithms introduce implicit residuals that facilitate privacy attacks targeting unlearned data. We observe that these residuals persist regardless of model architecture, parameters, and unlearning algorithm, exposing a new attack surface beyond conventional output-based leakage. Based on this insight, we propose the Reminiscence Attack (ReA), which amplifies the correlation between residuals and membership privacy through targeted fine-tuning. ReA achieves up to 1.90× and 1.12× higher accuracy than prior attacks when inferring class-wise and sample-wise membership, respectively. To mitigate such residual-induced privacy risk, we develop a dual-phase approximate unlearning framework that first eliminates deep-layer traces of unlearned data and then enforces convergence stability to prevent "pseudo-convergence", where a model's outputs resemble those of a retrained model while still preserving unlearned residuals. Our framework works for both classification and generation tasks. Experimental evaluations confirm that our approach maintains high unlearning efficacy while reducing adaptive privacy attack accuracy to near-random guessing, at 2–12% of the computational cost of full retraining from scratch.
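
The relearning intuition behind ReA lends itself to a simple probe: briefly fine-tune the unlearned model on a candidate and see how quickly its loss collapses, since residuals should let forgotten members be "relearned" faster than genuine non-members. The PyTorch sketch below is a hypothetical illustration of that general idea, not the paper's ReA implementation; the function name `relearn_speed_score`, the step count, and the threshold are made up for the example.

```python
import copy

import torch
import torch.nn.functional as F


def relearn_speed_score(unlearned_model, x, y, steps=5, lr=1e-3):
    """Hypothetical membership score for one candidate batch (x, y).

    Briefly fine-tunes a copy of the unlearned model on (x, y) and measures
    how fast the loss collapses. Intuition: residuals left by approximate
    unlearning let forgotten members be "relearned" faster than genuine
    non-members, so a larger loss drop is more member-like.
    """
    model = copy.deepcopy(unlearned_model)  # never mutate the target model
    model.train()
    opt = torch.optim.SGD(model.parameters(), lr=lr)

    with torch.no_grad():
        loss_before = F.cross_entropy(model(x), y).item()

    for _ in range(steps):  # the targeted fine-tuning probe
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()

    with torch.no_grad():
        loss_after = F.cross_entropy(model(x), y).item()

    # Fractional loss reduction after a few relearning steps.
    return (loss_before - loss_after) / max(loss_before, 1e-8)


def infer_membership(unlearned_model, x, y, threshold=0.5):
    """Thresholded decision on the relearning-speed score."""
    return relearn_speed_score(unlearned_model, x, y) > threshold
```

In a real attack pipeline the decision threshold would be calibrated, for example against shadow models trained with and without the candidate data, rather than fixed a priori.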
Problem

Research questions and friction points this paper is trying to address.

Exposes privacy risks in approximate machine unlearning algorithms
Proposes Reminiscence Attack to exploit unlearned data residuals
Develops dual-phase framework to mitigate residual-induced privacy risks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Exploiting residuals in approximate unlearning algorithms
Proposing Reminiscence Attack to amplify privacy risks
Developing a dual-phase framework to mitigate residual risks (see the sketch after this list)
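
On the defense side, the abstract's two phases can be pictured with a minimal PyTorch-style sketch. This is an assumption-laden illustration rather than the paper's procedure: the choice of deep layers, the uniform-output scrubbing loss, and the drift-based stopping rule are stand-ins for the paper's fine-grained residual analysis and convergence-stability constraints.

```python
import torch
import torch.nn.functional as F


def dual_phase_unlearn(model, forget_loader, retain_loader, deep_layers,
                       scrub_epochs=1, stabilize_epochs=5, lr=1e-4,
                       stability_tol=1e-3):
    """Hypothetical two-phase residual-elimination loop.

    Phase 1 updates only the deep layers, pushing predictions on the forget
    set toward uniform to erase class-specific residual traces. Phase 2
    fine-tunes on retained data until the model's outputs stop drifting,
    a guard against "pseudo-convergence".
    """
    model.train()

    # ----- Phase 1: scrub deep-layer traces of the forget set -----
    for p in model.parameters():
        p.requires_grad_(False)
    for layer in deep_layers:  # e.g. the last block and the classifier head
        for p in layer.parameters():
            p.requires_grad_(True)
    opt = torch.optim.Adam([p for p in model.parameters() if p.requires_grad],
                           lr=lr)
    for _ in range(scrub_epochs):
        for x, _ in forget_loader:
            logits = model(x)
            uniform = torch.full_like(logits, 1.0 / logits.size(1))
            loss = F.kl_div(F.log_softmax(logits, dim=1), uniform,
                            reduction="batchmean")
            opt.zero_grad()
            loss.backward()
            opt.step()

    # ----- Phase 2: enforce convergence stability on retained data -----
    for p in model.parameters():
        p.requires_grad_(True)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    probe_x, _ = next(iter(retain_loader))  # fixed batch to track output drift
    prev = None
    for _ in range(stabilize_epochs):
        for x, y in retain_loader:
            loss = F.cross_entropy(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        with torch.no_grad():
            out = F.softmax(model(probe_x), dim=1)
        if prev is not None and (out - prev).abs().mean().item() < stability_tol:
            break  # outputs have stopped moving: treat as stably converged
        prev = out
    return model
```

The drift check is what targets pseudo-convergence: rather than stopping as soon as accuracy matches a retrained model, training continues until the outputs themselves stop moving.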
Yaxin Xiao
Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University
Qingqing Ye
Assistant Professor, The Hong Kong Polytechnic University
data privacy and security · adversarial machine learning
Li Hu
Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University
Huadi Zheng
Unknown affiliation
Voice Technology · Information Security
Haibo Hu
Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University
Zi Liang
Hong Kong Polytechnic University
Natural Language Processing · AI Security
Haoyang Li
Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University
Yijie Jiao
Department of Electrical and Electronic Engineering, The Hong Kong Polytechnic University