🤖 AI Summary
This study investigates the impact of data augmentation on machine unlearning: weakening a model's memorization of specific training samples while preserving generalization on the remaining data. We systematically integrate augmentation techniques, particularly TrivialAug, into mainstream unlearning methods, including SalUn, random labeling, and fine-tuning, and evaluate them on CIFAR-10 and CIFAR-100. Our key finding is that carefully selected augmentation strategies significantly suppress residual memorization of forgotten samples without degrading accuracy on retained data. Empirically, TrivialAug reduces the average unlearning gap by up to 40.12%, substantially narrowing the gap to full retraining baselines. To our knowledge, this work is the first to reveal the dual role of data augmentation in machine unlearning, as both a *memorization suppressor* and a *privacy–utility balancer*, and it points toward efficient, low-overhead privacy-preserving model updates.
📝 Abstract
Machine Unlearning (MU) aims to remove the influence of specific data from a trained model while preserving its performance on the remaining data. Although a few works suggest connections between memorization and augmentation, the role of systematic augmentation design in MU remains under-investigated. In this work, we investigate the impact of different data augmentation strategies on the performance of unlearning methods, including SalUn, Random Label, and Fine-Tuning. Experiments on CIFAR-10 and CIFAR-100 under varying forget rates show that proper augmentation design can significantly improve unlearning effectiveness, reducing the performance gap to retrained models: with TrivialAug, the Average Gap unlearning metric is reduced by up to 40.12%. Our results suggest that augmentation not only helps reduce memorization but also plays a crucial role in achieving privacy-preserving and efficient unlearning.
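The Average Gap metric referenced above is, in the unlearning literature, typically the mean absolute difference between the unlearned model and the retrained (gold-standard) model across evaluation metrics such as unlearning accuracy (UA), retain accuracy (RA), test accuracy (TA), and MIA success rate. A minimal sketch, with hypothetical metric values:

```python
def average_gap(unlearned: dict, retrained: dict) -> float:
    """Mean absolute difference between an unlearned model's metrics
    and the retrained model's metrics (lower is better)."""
    keys = unlearned.keys() & retrained.keys()
    return sum(abs(unlearned[k] - retrained[k]) for k in keys) / len(keys)

# Hypothetical metric values in percent (UA, RA, TA, MIA), for illustration only.
retrained = {"UA": 5.2, "RA": 99.6, "TA": 94.3, "MIA": 13.1}
unlearned = {"UA": 4.0, "RA": 99.1, "TA": 93.5, "MIA": 15.0}
print(average_gap(unlearned, retrained))  # mean of the four absolute gaps: 1.1
```

A smaller Average Gap means the unlearned model behaves more like one retrained from scratch without the forget set, which is why reductions in this metric are read as better unlearning.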