Retain-Neutral Surrogates for Min-Max Unlearning

📅 2026-05-07

🤖 AI Summary
Machine unlearning must remove the influence of specified data without degrading performance on the retained data. When forget and retain gradients are strongly aligned, a forget-maximizing perturbation in min-max unlearning can inadvertently increase retain loss. This work proposes Retain-Orthogonal Surrogate Unlearning (ROSU), which constrains the inner perturbation to be retain-neutral: under a fixed perturbation budget, it maximizes first-order forget gain subject to zero first-order change in retain loss. The resulting perturbation is orthogonal to the retain gradient and has a closed form, and ROSU applies amplification along this retain-neutral direction. The analysis gives a curvature-controlled second-order bound on retain damage, shows that ROSU strictly reduces surrogate retain loss relative to standard min-max perturbations when the two gradients are positively aligned, and shows near-equivalence when they are nearly orthogonal. On CIFAR-10/100, Tiny-ImageNet, TOFU, and WMDP, ROSU shows its clearest gains in high-coupling regimes and remains competitive elsewhere.
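The closed-form perturbation mentioned in the summary can be sketched as a small constrained problem. The notation here is an assumption introduced for illustration and may differ from the paper's: $g_f$ and $g_r$ are the forget and retain gradients at the current parameters, $\delta$ is the perturbation, $\rho$ is the budget, and the budget is taken to be an $\ell_2$ ball.

```latex
\begin{aligned}
\delta^\star &= \arg\max_{\delta}\; g_f^\top \delta
  \quad \text{s.t.} \quad g_r^\top \delta = 0, \;\; \|\delta\|_2 \le \rho, \\
\delta^\star &= \rho \, \frac{P_r\, g_f}{\|P_r\, g_f\|_2},
  \qquad P_r = I - \frac{g_r g_r^\top}{\|g_r\|_2^2}, \\
g_f^\top \delta^\star &= \rho \, \|g_f\|_2 \sin\theta,
  \qquad \cos\theta = \frac{g_f^\top g_r}{\|g_f\|_2 \, \|g_r\|_2}.
\end{aligned}
```

Under these assumptions the solution is the forget gradient projected onto the orthogonal complement of the retain gradient and rescaled to the budget, valid whenever $P_r g_f \neq 0$. The last line shows that the first-order forget gain reachable under the constraint shrinks by a factor of $\sin\theta$ as the two gradients become more aligned.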
📝 Abstract
Machine unlearning seeks to remove the influence of designated training data while preserving performance on the remaining data. Approximate unlearning can be viewed as a local editing problem; in min-max unlearning, the key local object is the surrogate point at which the retain objective is evaluated. When forget and retain gradients are strongly aligned, an unconstrained forget-maximizing perturbation can move to a surrogate point that increases retain loss. We propose Retain-Orthogonal Surrogate Unlearning (ROSU), which constrains the inner surrogate construction by maximizing first-order forget gain subject to zero first-order retain change under a fixed perturbation budget. This yields a closed-form retain-orthogonal perturbation, a lightweight transported outer update, and amplification along the retain-neutral direction. Our analysis establishes (i) a curvature-controlled second-order bound on retain damage, (ii) a positive-alignment regime in which ROSU strictly reduces surrogate retain loss relative to standard min-max perturbations, and (iii) near-equivalence when the two gradients are nearly orthogonal. Across vision and language benchmarks (CIFAR-10/100, Tiny-ImageNet, TOFU, WMDP), the empirical pattern follows this geometry: ROSU gives its clearest gains in high-coupling regimes while remaining competitive elsewhere.
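The geometry behind regimes (ii) and (iii) can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes an L2 perturbation budget, uses hypothetical two-dimensional gradients, and covers only the inner perturbation (not the transported outer update or the amplification step). The function names are made up for this example.

```python
import numpy as np

def standard_perturbation(g_forget, rho, eps=1e-12):
    """Unconstrained forget-maximizing step of L2 norm rho (standard min-max inner step)."""
    norm = np.linalg.norm(g_forget)
    if norm < eps:
        return np.zeros_like(g_forget)
    return rho * g_forget / norm

def retain_orthogonal_perturbation(g_forget, g_retain, rho, eps=1e-12):
    """Maximize g_forget @ delta subject to g_retain @ delta = 0 and ||delta||_2 <= rho.

    Assumed closed form: project g_forget off g_retain, then rescale to the budget.
    """
    g_r_sq = float(g_retain @ g_retain)
    if g_r_sq < eps:
        # No retain gradient to avoid: fall back to the unconstrained direction.
        projected = g_forget.copy()
    else:
        projected = g_forget - (g_forget @ g_retain) / g_r_sq * g_retain
    norm = np.linalg.norm(projected)
    if norm < eps:
        # Forget gradient is parallel to the retain gradient: no neutral direction exists.
        return np.zeros_like(g_forget)
    return rho * projected / norm

# Hypothetical gradients, chosen only to illustrate the two regimes.
rho = 0.1
g_retain = np.array([1.0, 0.0])
g_forget_aligned = np.array([4.0, 3.0])     # cosine with g_retain is 0.8
g_forget_orthogonal = np.array([0.0, 5.0])  # cosine with g_retain is 0.0

for name, g_f in [("aligned", g_forget_aligned), ("orthogonal", g_forget_orthogonal)]:
    d_std = standard_perturbation(g_f, rho)
    d_rosu = retain_orthogonal_perturbation(g_f, g_retain, rho)
    print(name,
          "| first-order retain change (standard, orthogonal):",
          float(g_retain @ d_std), float(g_retain @ d_rosu),
          "| first-order forget gain (standard, orthogonal):",
          float(g_f @ d_std), float(g_f @ d_rosu))
```

With these toy gradients, the standard step in the aligned case raises retain loss to first order by about 0.08, while the retain-orthogonal step leaves it unchanged at the cost of a smaller first-order forget gain (about 0.3 versus 0.5). In the orthogonal case the two perturbations coincide, which matches the near-equivalence described in (iii).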
Problem

Research questions and friction points this paper is trying to address.

machine unlearning
min-max unlearning
retain loss
gradient alignment
surrogate point
Innovation

Methods, ideas, or system contributions that make the work stand out.

machine unlearning
retain-orthogonal perturbation
min-max unlearning
surrogate point
gradient alignment