🤖 AI Summary
Problem: Existing machine unlearning methods for fairness and robustness still rely on the binary data-deletion paradigm designed for privacy. This causes "over-unlearning", i.e., significant information loss, which has mostly been described in terms of the utility degradation it causes.
Method: This paper uses counterfactual leave-one-out analysis to investigate the root causes of over-unlearning and proposes a soft-weighted unlearning framework. It introduces a weighted influence function that assigns a tailored weight to each sample by analytically solving a convex quadratic program. Unlike conventional binary deletion, these continuous per-sample weights enable fine-grained model adjustment.
Contribution/Results: In fairness- and robustness-driven tasks, the soft-weighted scheme significantly outperforms hard-weighted schemes on fairness and robustness metrics while alleviating the decline in utility. The scheme can be integrated into most existing unlearning algorithms.
📝 Abstract
Machine unlearning, as a post-hoc processing technique, has been widely adopted to address challenges such as bias mitigation and robustness enhancement, colloquially termed machine unlearning for fairness and robustness. However, existing non-privacy unlearning solutions still rely on the binary data removal framework designed for privacy-driven motivations, leading to significant information loss, a phenomenon known as over-unlearning. While many studies describe over-unlearning primarily in terms of the utility degradation it causes, in this work we investigate its fundamental causes through counterfactual leave-one-out analysis. We introduce a weighted influence function that assigns a tailored weight to each sample by analytically solving a convex quadratic programming problem. Building on this, we propose a soft-weighted framework that enables fine-grained model adjustments to address the over-unlearning challenge. We demonstrate that the proposed soft-weighted scheme is versatile and can be seamlessly integrated into most existing unlearning algorithms. Extensive experiments show that, in fairness- and robustness-driven tasks, the soft-weighted scheme significantly outperforms hard-weighted schemes on fairness and robustness metrics and alleviates the decline in utility metrics, thereby making machine unlearning a more effective correction solution.
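To make the contrast between hard (binary) and soft weighting concrete, the following is a minimal sketch and not the paper's actual formulation. It assumes a ridge-regression model, a generic evaluation function whose gradient `grad_f` is supplied by the caller, and a hypothetical quadratic program, min over eps of s·eps + (gamma/2)·||eps||² subject to -1 ≤ eps_i ≤ 0, where s_i is the standard first-order influence score of training sample i. Because this program is separable, its solution has the closed form clip(-s/gamma, -1, 0). All function names and the parameter `gamma` are illustrative assumptions.

```python
import numpy as np

def fit_ridge(X, y, lam):
    """Minimize 0.5*sum((X@theta - y)**2) + 0.5*lam*||theta||^2.
    Returns the minimizer and the (constant) Hessian of the objective."""
    H = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(H, X.T @ y), H

def influence_scores(X, y, theta, H, grad_f):
    """First-order influence s_i = d f / d eps_i at eps = 0, where eps_i
    upweights the loss of training sample i (eps_i = -1 removes it)."""
    # Gradient of 0.5*(x_i^T theta - y_i)^2 with respect to theta, one row per sample.
    per_sample_grads = (X @ theta - y)[:, None] * X
    # d theta / d eps_i = -H^{-1} g_i, so s_i = -g_i^T H^{-1} grad_f (H is symmetric).
    scores = -per_sample_grads @ np.linalg.solve(H, grad_f)
    return scores, per_sample_grads

def hard_weights(scores):
    """Binary removal: fully delete every sample whose removal is predicted to lower f."""
    return np.where(scores > 0, -1.0, 0.0)

def soft_weights(scores, gamma):
    """Closed-form solution of the assumed box-constrained QP (illustrative only)."""
    return np.clip(-scores / gamma, -1.0, 0.0)

def apply_weights(theta, H, per_sample_grads, eps):
    """One influence-function update: theta + sum_i eps_i * (d theta / d eps_i)."""
    return theta - np.linalg.solve(H, per_sample_grads.T @ eps)

# Usage on synthetic data (all values here are made up for illustration).
rng = np.random.default_rng(0)
true_theta = rng.normal(size=5)
X = rng.normal(size=(200, 5))
y = X @ true_theta + rng.normal(scale=0.5, size=200)
y[:20] += 5.0  # corrupt some training labels
X_val = rng.normal(size=(100, 5))
y_val = X_val @ true_theta + rng.normal(scale=0.5, size=100)

theta, H = fit_ridge(X, y, lam=1.0)
# Evaluation function f = mean squared validation error / 2; this is its gradient.
grad_f = X_val.T @ (X_val @ theta - y_val) / len(y_val)
scores, grads = influence_scores(X, y, theta, H, grad_f)

theta_hard = apply_weights(theta, H, grads, hard_weights(scores))
theta_soft = apply_weights(theta, H, grads, soft_weights(scores, gamma=0.05))
print("samples fully removed (hard):", int((hard_weights(scores) == -1).sum()))
print("samples partially down-weighted (soft):",
      int(((soft_weights(scores, 0.05) < 0) & (soft_weights(scores, 0.05) > -1)).sum()))
```

In this sketch, hard weighting discards every sample with a positive score outright, whereas soft weighting down-weights each sample in proportion to its score, so mildly harmful samples keep most of their contribution. As `gamma` shrinks toward zero, the soft weights approach the hard ones.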