🤖 AI Summary
Machine unlearning aims to efficiently remove the influence of specific training samples from a pre-trained model while preserving performance on the remaining data. This paper proposes OrthoGrad, a novel method that projects the gradient of the to-be-forgotten samples onto the subspace orthogonal to the gradients of retained samples, thereby sidestepping the instability of conventionally balancing gradient ascent on the unlearn set against gradient descent on the retain set. OrthoGrad performs unlearning via per-sample gradient computation on the retain batch followed by orthogonalization of the unlearn gradient against the resulting gradient subspace. Evaluated across multiple benchmarks, including automatic speech recognition, OrthoGrad significantly improves unlearning while causing smaller performance degradation and greater stability on the retained data. These results support gradient subspace orthogonalization as an effective paradigm for machine unlearning.
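The per-sample gradient step mentioned above can be sketched with PyTorch's `torch.func` transforms. This is a minimal, hypothetical illustration rather than the authors' implementation: the model, loss, and batch shapes are placeholders, and it assumes the resulting gradients are later flattened for the projection step shown after the abstract below.

```python
import torch
from torch.func import functional_call, grad, vmap

# Hypothetical tiny model; the architecture and shapes are placeholders.
model = torch.nn.Linear(10, 2)
params = {k: v.detach() for k, v in model.named_parameters()}

def sample_loss(params, x, y):
    # Loss of a single retain sample under the given parameters.
    logits = functional_call(model, params, (x.unsqueeze(0),))
    return torch.nn.functional.cross_entropy(logits, y.unsqueeze(0))

# grad differentiates w.r.t. params; vmap maps it over the retain batch,
# yielding one gradient per retained sample.
per_sample_grads = vmap(grad(sample_loss), in_dims=(None, 0, 0))

x_retain = torch.randn(8, 10)
y_retain = torch.randint(0, 2, (8,))
grads = per_sample_grads(params, x_retain, y_retain)
print(grads["weight"].shape)  # torch.Size([8, 2, 10]): one gradient per sample
```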
📝 Abstract
Machine unlearning aims to remove the influence of problematic training data after a model has been trained. The primary challenge is ensuring that the process effectively removes the specified data without compromising the model's overall performance on the remaining dataset. Many existing machine unlearning methods address this challenge by carefully balancing gradient ascent on the unlearn data with gradient descent on a retain set representing the training data. Here, we propose OrthoGrad, a novel approach that mitigates interference between the unlearn set and the retain set instead of pitting ascent and descent processes against each other. Our method projects the gradient of the unlearn set onto the subspace orthogonal to all gradients in the retain batch, effectively avoiding any gradient interference. We demonstrate the effectiveness of OrthoGrad on multiple machine unlearning benchmarks, including automatic speech recognition, outperforming competing methods.
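To make the projection concrete, below is a minimal sketch of the orthogonalization step, assuming flattened per-sample gradients. The function name and the QR-based construction of the orthonormal basis are our own illustrative choices, not necessarily the paper's implementation.

```python
import torch

def orthogonalize_unlearn_grad(g_unlearn, retain_grads):
    """Project g_unlearn onto the subspace orthogonal to every retain gradient.

    g_unlearn:    (d,) flattened gradient of the unlearn loss
    retain_grads: (k, d) stacked, flattened per-sample retain gradients
    """
    # Orthonormal basis of the retain-gradient subspace via reduced QR.
    Q, _ = torch.linalg.qr(retain_grads.T)       # Q: (d, k)
    # Subtract the component of g_unlearn that lies inside that subspace.
    return g_unlearn - Q @ (Q.T @ g_unlearn)

# Toy usage: 3 retain gradients in a 10-dimensional parameter space.
G_retain = torch.randn(3, 10)
g_u = torch.randn(10)
g_perp = orthogonalize_unlearn_grad(g_u, G_retain)
# The projected update no longer interferes with any retain gradient:
print(torch.allclose(G_retain @ g_perp, torch.zeros(3), atol=1e-5))  # True
```

Since the projected gradient has zero inner product with every retain gradient, an ascent step along it leaves the loss on the retain batch unchanged to first order, which is the interference-avoidance property the abstract describes.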