Data Augmentation Improves Machine Unlearning

📅 2025-08-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
This study investigates the impact of data augmentation on machine unlearning, aiming to weaken model memorization of specific training samples while preserving generalization performance on the remaining data. We systematically integrate augmentation techniques—particularly TrivialAug—into mainstream unlearning methods, including SalUn, random labeling, and fine-tuning, and evaluate them on CIFAR-10 and CIFAR-100. Our key finding is that carefully selected augmentation strategies significantly suppress residual memorization of forgotten samples without degrading accuracy on retained data. Empirically, TrivialAug reduces the average unlearning gap by up to 40.12%, substantially narrowing the performance gap with full retraining baselines. To our knowledge, this work is the first to reveal the dual role of data augmentation in machine unlearning: as both a *memorization suppressor* and a *privacy–utility balancer*. It establishes a new paradigm for efficient, low-overhead privacy-preserving model updates.
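The headline number above is a reduction in an "Average Gap" metric measuring how far the unlearned model sits from a model retrained from scratch without the forget set. As a hedged sketch only, a common formulation (the exact metric set used by the paper is an assumption here) averages the absolute differences across several evaluation metrics such as forget accuracy, retain accuracy, test accuracy, and a membership-inference score:

```python
# Hypothetical sketch of an "Average Gap" unlearning metric: the mean
# absolute difference between an unlearned model's evaluation metrics and
# those of a model retrained from scratch without the forget set.
# The metric names below are illustrative assumptions, not from the paper.

def average_gap(unlearned: dict, retrained: dict) -> float:
    """Mean |unlearned - retrained| over the shared evaluation metrics."""
    keys = unlearned.keys() & retrained.keys()
    return sum(abs(unlearned[k] - retrained[k]) for k in keys) / len(keys)

# Toy numbers (accuracies in %) for illustration only.
unlearned = {"forget_acc": 7.0, "retain_acc": 99.0, "test_acc": 91.0, "mia": 12.0}
retrained = {"forget_acc": 5.0, "retain_acc": 100.0, "test_acc": 93.0, "mia": 13.0}
print(round(average_gap(unlearned, retrained), 2))  # → 1.5
```

Under this reading, a "40.12% reduction" means the averaged distance to the retraining baseline shrinks by that fraction when TrivialAug is added to the unlearning method.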

📝 Abstract
Machine Unlearning (MU) aims to remove the influence of specific data from a trained model while preserving its performance on the remaining data. Although a few works suggest connections between memorization and augmentation, the role of systematic augmentation design in MU remains under-investigated. In this work, we investigate the impact of different data augmentation strategies on the performance of unlearning methods, including SalUn, Random Label, and Fine-Tuning. Experiments conducted on CIFAR-10 and CIFAR-100, under varying forget rates, show that proper augmentation design can significantly improve unlearning effectiveness, reducing the performance gap to retrained models. Results show a reduction of up to 40.12% in the Average Gap unlearning metric when using TrivialAug augmentation. Our results suggest that augmentation not only helps reduce memorization but also plays a crucial role in achieving privacy-preserving and efficient unlearning.
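The augmentation the abstract highlights, TrivialAug (TrivialAugment), follows a deliberately simple rule: for every training sample, pick one augmentation uniformly at random and apply it at a strength drawn uniformly from a discrete range. The sketch below mimics only that sampling rule with a toy op set; it is an illustrative assumption, not the paper's pipeline (torchvision users would reach for `transforms.TrivialAugmentWide`):

```python
import random

# Minimal sketch of the TrivialAugment sampling rule: one op chosen uniformly
# per sample, with a strength drawn uniformly from discrete bins. The op names
# are toy stand-ins; no image transform is actually applied here.

OPS = ["identity", "rotate", "shear_x", "translate_y", "color", "contrast"]
NUM_BINS = 31  # discrete strength levels

def trivial_augment(sample, rng: random.Random):
    """Return (sample, op_name, strength) for one training sample."""
    op = rng.choice(OPS)
    strength = rng.randrange(NUM_BINS) / (NUM_BINS - 1)  # normalized to [0, 1]
    # A real implementation would transform the image here; we just tag it.
    return (sample, op, strength)

rng = random.Random(0)
_, op, s = trivial_augment("img_0", rng)
print(op, 0.0 <= s <= 1.0)
```

Because the policy is parameter-free (no search phase), it adds essentially no overhead to an unlearning run, which fits the paper's framing of low-overhead privacy-preserving updates.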
Problem

Research questions and friction points this paper is trying to address.

Investigating data augmentation strategies for machine unlearning effectiveness
Reducing performance gap between unlearned and retrained models
Achieving privacy-preserving unlearning by reducing data memorization
Innovation

Methods, ideas, or system contributions that make the work stand out.

Data augmentation enhances machine unlearning effectiveness
TrivialAug reduces unlearning gap by over 40%
Systematic augmentation design reduces memorization for privacy
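One of the baselines listed above, Random Label, unlearns by reassigning each forget-set sample a label drawn uniformly from the other classes and then fine-tuning on the relabeled data; augmentation is applied on top of this during fine-tuning. A minimal sketch of the relabeling step, assuming uniform sampling over the wrong classes (the fine-tuning loop itself is omitted):

```python
import random

# Hedged sketch of the Random Label relabeling step: every forget-set label
# is replaced by a class chosen uniformly from the OTHER classes, so the
# model is pushed away from its memorized prediction on those samples.

def random_relabel(labels, num_classes: int, rng: random.Random):
    """Replace every label with a uniformly chosen different class."""
    new_labels = []
    for y in labels:
        choices = [c for c in range(num_classes) if c != y]
        new_labels.append(rng.choice(choices))
    return new_labels

forget_labels = [0, 3, 3, 7]
relabeled = random_relabel(forget_labels, num_classes=10, rng=random.Random(42))
print(all(a != b for a, b in zip(forget_labels, relabeled)))  # → True
```

Excluding the original class guarantees every forget sample receives a genuinely conflicting training signal, which is the mechanism the relabeling-based methods rely on.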
Andreza M. C. Falcao
Visual Computing Lab, Department of Computing, Universidade Federal Rural de Pernambuco, Brazil
Filipe R. Cordeiro
Universidade Federal Rural de Pernambuco
Machine Learning · Computer Vision · Medical Image Analysis