🤖 AI Summary
This paper addresses machine unlearning: how to verifiably erase the influence of specific data points on a pre-trained model without full retraining. For linear models and two-layer ReLU networks, we propose a novel $(\epsilon, \delta, \tau)$-successful unlearning criterion, the first to link unlearning efficacy to satisfaction of the KKT conditions on the retained data. Our method employs gradient ascent to cancel the target sample's gradient contribution. We theoretically prove that, under appropriate scaling, the approach strictly satisfies the unlearning criterion. Leveraging high-dimensional statistical analysis and implicit-bias theory, we further guarantee that the unlearned model maintains strong generalization performance on the retained data. Experiments on Gaussian mixture data validate both the effectiveness and computational efficiency of our method.
📝 Abstract
Machine unlearning aims to remove specific data from trained models, addressing growing privacy and ethical concerns. We provide a theoretical analysis of a simple and widely used method, gradient ascent, used to reverse the influence of a specific data point without retraining from scratch. Leveraging the implicit bias of gradient descent towards solutions that satisfy the Karush-Kuhn-Tucker (KKT) conditions of a margin maximization problem, we quantify the quality of the unlearned model by evaluating how well it satisfies these conditions w.r.t. the retained data. To formalize this idea, we propose a new success criterion, termed **$(\epsilon, \delta, \tau)$-successful** unlearning, and show that, for both linear models and two-layer neural networks with high-dimensional data, a properly scaled gradient-ascent step satisfies this criterion and yields a model that closely approximates the retrained solution on the retained data. We also show that gradient ascent performs successful unlearning while still preserving generalization in a synthetic Gaussian-mixture setting.
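The core operation the abstract describes, a single scaled gradient-ascent step on the forgotten point, can be sketched as follows. This is a minimal illustration for a linear classifier with logistic loss; the function names, the loss choice, and the step size `eta` are assumptions for illustration and do not reproduce the paper's exact scaling.

```python
import numpy as np

def logistic_grad(w, x, y):
    """Gradient w.r.t. w of the logistic loss log(1 + exp(-y * <w, x>))."""
    margin = y * np.dot(w, x)
    return -y * x / (1.0 + np.exp(margin))

def unlearn_step(w, x_forget, y_forget, eta):
    """One gradient-ascent step: move *up* the loss on the forgotten sample,
    pushing the model away from the influence that sample exerted in training."""
    return w + eta * logistic_grad(w, x_forget, y_forget)

# Toy usage: a random linear model and one sample to "forget".
rng = np.random.default_rng(0)
w = rng.normal(size=5)
x_f, y_f = rng.normal(size=5), 1.0
w_new = unlearn_step(w, x_f, y_f, eta=0.5)
```

Because the logistic loss is convex in `w`, stepping along its gradient strictly increases the loss on the forgotten point, i.e. shrinks that point's margin, which is the intended unlearning effect.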