🤖 AI Summary
This work addresses the challenge of removing the influence of specific data from a trained model (machine unlearning) without incurring the high cost of full retraining. It introduces a distribution-based unlearning paradigm that models the structure of the underlying data distributions in order to infer the unlearning signal precisely. Under a verifiable admissibility criterion, the framework provides a bound on the KL divergence between the unlearned model and the ideal model obtained by complete retraining, which guarantees that the two stay close. Experiments on three forgetting scenarios show that the method yields the classifier closest to the ideal retrained model among the compared approaches.
📝 Abstract
This paper proposes a paradigm shift that links machine unlearning directly to the structure of the data distributions, rather than treating it as a mere update of the neural network parameters. We show that inferring these distributions with precision makes it possible to distill the exact unlearning signal induced by the modeling. Theoretical bounds on the Kullback-Leibler divergence from the ideal retrained model to our unlearned model, under a verifiable admissibility criterion, establish the soundness of our framework. Experiments over three forgetting scenarios show that the method yields the classifier closest to the ideal retrained model among the compared competitors.
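To make the bounded quantity concrete, the sketch below computes a Kullback-Leibler divergence between the class probabilities of two classifiers, averaged over a small batch. It is only an illustration of the kind of quantity being bounded, not the paper's bound or method: the function names `kl_divergence` and `mean_kl`, the choice of predictive distributions as the objects being compared, the direction KL(retrained || unlearned), and all numbers are assumptions made for this example.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as probability lists."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of classes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0.0:  # terms with p_i = 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))  # eps guards against q_i = 0
    return total

def mean_kl(retrained_probs, unlearned_probs):
    """Average KL(retrained || unlearned) over a batch of inputs."""
    pairs = list(zip(retrained_probs, unlearned_probs))
    return sum(kl_divergence(p, q) for p, q in pairs) / len(pairs)

# Hypothetical softmax outputs on three inputs (made-up values, not from the paper).
retrained = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
unlearned = [[0.6, 0.3, 0.1], [0.1, 0.7, 0.2], [0.3, 0.3, 0.4]]

gap = mean_kl(retrained, unlearned)
print(f"mean KL(retrained || unlearned) = {gap:.4f}")
```

A value of zero means the two models predict identically on the batch; larger values mean the unlearned model has drifted further from the retrained reference. KL divergence is asymmetric, so swapping the arguments generally gives a different number.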