Exact Unlearning from Proxies Induces Closeness Guarantees on Approximate Unlearning

📅 2026-05-11

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the problem of removing the influence of specific data from a trained model (machine unlearning) without paying the cost of full retraining. It proposes a distribution-based unlearning paradigm: by modeling the structure of the underlying data distributions, the method infers the exact unlearning signal induced by that modeling. Under a verifiable admissibility criterion, the framework comes with a formal guarantee, namely a bound on the Kullback-Leibler (KL) divergence from the ideal model obtained by complete retraining to the unlearned model, which keeps the two close. Experiments on three forgetting scenarios report that the method yields the classifier closest to the ideal retrained model among the compared approaches.

📝 Abstract

This paper proposes a paradigm shift that links machine unlearning directly to the structure of the data distributions, rather than treating it as a mere update of the neural network parameters. We show that inferring these distributions precisely makes it possible to distill the exact unlearning signal induced by the modeling. Theoretical bounds on the Kullback-Leibler divergence from the ideal retrained model to our unlearned model, under a verifiable admissibility criterion, establish the soundness of our framework. Experiments over three forgetting scenarios show that the method reaches the classifier closest to the ideal retrained model when compared to competitors.
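To make the closeness measure concrete, the sketch below estimates the average KL divergence between the predictive distributions of a retrained classifier and an unlearned classifier on a few evaluation inputs. It is only an illustration of the quantity the abstract bounds, not the paper's method: the function names (`softmax`, `kl_divergence`, `mean_kl`), the toy logits, and the direction KL(retrained || unlearned) are assumptions made here, since the summary does not spell out the exact definition.

```python
import math

def softmax(logits):
    """Turn one row of logits into class probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same classes.

    Assumption: p is the retrained model's prediction and q the unlearned one.
    Terms with p_i == 0 contribute zero; q_i is floored at eps to avoid log(0).
    """
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0)

def mean_kl(retrained_logits, unlearned_logits):
    """Average KL(retrained || unlearned) over a batch of evaluation inputs."""
    kls = [
        kl_divergence(softmax(a), softmax(b))
        for a, b in zip(retrained_logits, unlearned_logits)
    ]
    return sum(kls) / len(kls)

# Hypothetical logits for three inputs and three classes.
retrained = [[2.0, 0.5, -1.0], [0.1, 0.2, 0.3], [-1.0, 3.0, 0.0]]
unlearned = [[1.8, 0.6, -0.9], [0.0, 0.3, 0.3], [-0.8, 2.7, 0.1]]

print(f"mean KL(retrained || unlearned) = {mean_kl(retrained, unlearned):.6f}")
```

A value near zero means the unlearned classifier predicts almost the same distributions as the retrained one. Because KL divergence is asymmetric, swapping the two arguments generally gives a different number.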

Problem

Research questions and friction points this paper is trying to address.

machine unlearning

data distribution

Kullback-Leibler divergence

model retraining

forgetting

Innovation

Methods, ideas, or system contributions that make the work stand out.

machine unlearning

data distribution

Kullback-Leibler divergence