Descend or Rewind? Stochastic Gradient Descent Unlearning

📅 2025-11-19
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This work addresses the machine unlearning problem—efficiently eliminating the influence of specific training samples on a model without full retraining. We propose a perturbation-based gradient system framework and establish, for the first time, unified $(\varepsilon,\delta)$-certified unlearning guarantees for both R2D and D2D unlearning algorithms under strongly convex, convex, and non-convex loss functions. By introducing a relaxed Gaussian mechanism and optimal coupling analysis, we characterize the unlearning behavior of these algorithms across convergence regimes: D2D achieves tighter unlearning bounds under strong convexity, whereas R2D leverages reverse perturbation to attain effective unlearning in convex and non-convex settings. Our approach combines theoretical rigor with practical simplicity, significantly reducing computational overhead for unlearning while preserving model utility.

📝 Abstract
Machine unlearning algorithms aim to remove the impact of selected training data from a model without the computational expenses of retraining from scratch. Two such algorithms are "Descent-to-Delete" (D2D) and "Rewind-to-Delete" (R2D), full-batch gradient descent algorithms that are easy to implement and satisfy provable unlearning guarantees. In particular, the stochastic version of D2D is widely implemented as the "finetuning" unlearning baseline, despite lacking theoretical backing on nonconvex functions. In this work, we prove $(\varepsilon, \delta)$ certified unlearning guarantees for stochastic R2D and D2D for strongly convex, convex, and nonconvex loss functions, by analyzing unlearning through the lens of disturbed or biased gradient systems, which may be contracting, semi-contracting, or expansive respectively. Our argument relies on optimally coupling the random behavior of the unlearning and retraining trajectories, resulting in a probabilistic sensitivity bound that can be combined with a novel relaxed Gaussian mechanism to achieve $(\varepsilon, \delta)$ unlearning. We determine that D2D can yield tighter guarantees for strongly convex functions compared to R2D by relying on contraction to a unique global minimum. However, unlike D2D, R2D can achieve unlearning in the convex and nonconvex setting because it draws the unlearned model closer to the retrained model by reversing the accumulated disturbances.
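The two algorithms compared in the abstract can be illustrated on a toy problem. The sketch below, a minimal illustration rather than the paper's method, contrasts D2D (continue gradient descent from the final model on the retained data, then perturb with Gaussian noise) with R2D (rewind to an earlier checkpoint, retrain on the retained data, then perturb). The quadratic loss, step size, iteration counts, and noise scale are all illustrative assumptions; in the paper the noise is calibrated to achieve $(\varepsilon, \delta)$ certified unlearning.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy strongly convex problem: least-squares regression on synthetic data.
X = rng.normal(size=(50, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.normal(size=50)

def grad(w, Xb, yb):
    # Gradient of the mean squared error loss.
    return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

def gd(w, Xb, yb, steps, lr=0.1):
    # Plain full-batch gradient descent.
    for _ in range(steps):
        w = w - lr * grad(w, Xb, yb)
    return w

# Train on the full dataset, saving a "rewind" checkpoint partway through.
w = gd(np.zeros(3), X, y, steps=20)
checkpoint = w.copy()              # R2D rewinds to this point
w_final = gd(w, X, y, steps=80)    # fully trained model

# Delete the first 10 samples; keep the rest as the retained set.
X_r, y_r = X[10:], y[10:]
sigma = 0.01  # illustrative noise scale (the paper calibrates this to eps, delta)

# D2D: descend on the retained data from the final model, then add noise.
w_d2d = gd(w_final.copy(), X_r, y_r, steps=30) + sigma * rng.normal(size=3)

# R2D: rewind to the checkpoint, retrain on the retained data, then add noise.
w_r2d = gd(checkpoint.copy(), X_r, y_r, steps=80) + sigma * rng.normal(size=3)

# Reference: retraining from scratch on the retained data only.
w_retrain = gd(np.zeros(3), X_r, y_r, steps=100)

print("D2D vs retrain:", np.linalg.norm(w_d2d - w_retrain))
print("R2D vs retrain:", np.linalg.norm(w_r2d - w_retrain))
```

On this strongly convex toy loss, both unlearned models land close to the from-scratch retrained model, consistent with the contraction argument for D2D under strong convexity; the paper's point is that only R2D retains guarantees once convexity is dropped.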
Problem

Research questions and friction points this paper is trying to address.

Providing certified unlearning guarantees for stochastic gradient descent algorithms
Analyzing unlearning through the behavior of disturbed or biased gradient systems
Comparing D2D and R2D algorithms across convex and nonconvex functions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Stochastic gradient descent unlearning with provable guarantees
Analyzing unlearning through the lens of disturbed gradient systems
Optimal coupling of unlearning and retraining trajectories