🤖 AI Summary
This study investigates how the parameterization level of deep neural networks (DNNs), measured here by network width, affects machine unlearning, i.e., updating a trained model to forget specific training samples without full retraining, in service of either privacy for the unlearned data or bias removal. Using several unlearning methods from the recent literature with validation-based hyperparameter tuning, an error-based analysis, and measurements of how decision regions change near the unlearned examples, the authors find that unlearning works best on overparameterized models, which balance generalization against the unlearning goal more effectively. For bias removal, this advantage holds only when the unlearning method explicitly uses the unlearned examples. The decision-region analysis attributes the success to the ability of overparameterized models to change their behavior in small regions of the input space around the unlearned examples while leaving most of the model's functionality unchanged.
📝 Abstract
Machine unlearning is the task of updating a trained model to forget specific training data without retraining from scratch. In this paper, we investigate how unlearning of deep neural networks (DNNs) is affected by the model parameterization level, which corresponds here to the DNN width. We define validation-based tuning for several unlearning methods from the recent literature, and show how these methods perform differently depending on (i) the DNN parameterization level, (ii) the unlearning goal (unlearned data privacy or bias removal), and (iii) whether the unlearning method explicitly uses the unlearned examples. Our results show that unlearning excels on overparameterized models in terms of balancing generalization against achieving the unlearning goal, although for bias removal this requires the unlearning method to use the unlearned examples. We further elucidate our error-based analysis by measuring how much unlearning changes the classification decision regions in the proximity of the unlearned examples while avoiding changes elsewhere. With this we show that the unlearning success of overparameterized models stems from their ability to delicately change the model functionality in small regions of the input space while keeping much of the model functionality unchanged.
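To make the decision-region measurement more concrete, the following is a minimal sketch of one way such a quantity could be estimated: sample points in a small neighborhood of chosen examples and count how often the predicted label differs before and after unlearning. This is not the paper's procedure. The function name `change_rate`, the box-shaped neighborhood, the `radius` and `samples_per_center` parameters, and the two toy classifiers are all assumptions introduced purely for illustration.

```python
import random
from typing import Callable, Sequence

Point = Sequence[float]
Classifier = Callable[[Point], int]


def change_rate(before: Classifier, after: Classifier, centers: Sequence[Point],
                radius: float = 0.1, samples_per_center: int = 200,
                seed: int = 0) -> float:
    """Fraction of sampled points near `centers` whose predicted label changes.

    Hypothetical helper: neighborhoods are axis-aligned boxes of half-width
    `radius`, which is an illustrative choice, not the paper's definition.
    """
    rng = random.Random(seed)
    changed = 0
    total = 0
    for center in centers:
        for _ in range(samples_per_center):
            # Draw a point uniformly from the box around the current center.
            point = [x + rng.uniform(-radius, radius) for x in center]
            if before(point) != after(point):
                changed += 1
            total += 1
    return changed / total if total else 0.0


# Toy stand-ins for a model before and after unlearning (assumptions).
UNLEARNED = (0.5, 0.5)


def before_model(p: Point) -> int:
    # Simple linear decision rule on the first coordinate.
    return 1 if p[0] > 0 else 0


def after_model(p: Point) -> int:
    # Same rule, except that a small disk around the unlearned example
    # is reassigned to class 0, mimicking a localized change.
    dist_sq = (p[0] - UNLEARNED[0]) ** 2 + (p[1] - UNLEARNED[1]) ** 2
    if dist_sq < 0.2 ** 2:
        return 0
    return before_model(p)


# Change rate near the unlearned example versus near other examples.
near_rate = change_rate(before_model, after_model, [UNLEARNED])
far_rate = change_rate(before_model, after_model, [(-0.5, -0.5), (0.5, -0.5)])
print(near_rate, far_rate)
```

In this toy setup the decision regions change only around the unlearned example, so a high change rate near it combined with a low change rate elsewhere corresponds to the localized behavior the abstract attributes to overparameterized models. With real models, `before` and `after` would wrap the predictions of the original and the unlearned networks.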