When to Forget? Complexity Trade-offs in Machine Unlearning

📅 2025-02-24
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work investigates the computational complexity of machine unlearning: the efficient removal of a target sample's influence from a trained model without access to that sample. Leveraging tools from strongly convex optimization and minimax analysis, we establish the first tight upper and lower bounds on the computation time required for unlearning. We introduce the *unlearning complexity ratio*, a novel metric quantifying the cost advantage of unlearning algorithms over full retraining. Furthermore, we construct a three-region phase diagram characterizing the critical interplay among data dimensionality, the number of samples to forget, and privacy constraints in determining the feasibility of unlearning. Our theoretical analysis reveals that, in a moderate-difficulty regime, optimal unlearning algorithms achieve severalfold speedups over retraining, providing both a foundational theoretical framework and a practical feasibility criterion for efficient, privacy-compliant model updates.

📝 Abstract
Machine Unlearning (MU) aims at removing the influence of specific data points from a trained model, striving to achieve this at a fraction of the cost of full model retraining. In this paper, we analyze the efficiency of unlearning methods and establish the first upper and lower bounds on minimax computation times for this problem, characterizing the performance of the most efficient algorithm against the most difficult objective function. Specifically, for strongly convex objective functions and under the assumption that the forget data is inaccessible to the unlearning method, we provide a phase diagram for the unlearning complexity ratio -- a novel metric that compares the computational cost of the best unlearning method to full model retraining. The phase diagram reveals three distinct regimes: one where unlearning at a reduced cost is infeasible, another where unlearning is trivial because adding noise suffices, and a third where unlearning achieves significant computational advantages over retraining. These findings highlight the critical role of factors such as data dimensionality, the number of samples to forget, and privacy constraints in determining the practical feasibility of unlearning.
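The abstract's central metric and its three regimes can be sketched in a few lines. This is a purely illustrative toy, not the paper's formal definition: the function names, the `noise_suffices` flag, and the comparison against a ratio of 1.0 are assumptions made for exposition.

```python
def unlearning_complexity_ratio(unlearning_cost: float, retraining_cost: float) -> float:
    """Ratio of the best unlearning method's computational cost to the cost of
    full retraining; values below 1.0 mean unlearning is the cheaper option."""
    return unlearning_cost / retraining_cost


def classify_regime(ratio: float, noise_suffices: bool) -> str:
    """Map a cost ratio onto the three regimes the abstract describes.

    `noise_suffices` stands in for the regime where simply adding noise
    already satisfies the unlearning guarantee (hypothetical flag)."""
    if noise_suffices:
        return "trivial: adding noise suffices"
    if ratio >= 1.0:
        return "infeasible: no cheaper than retraining"
    return "advantageous: unlearning beats retraining"


# Example: unlearning costs 20 time units versus 100 for full retraining.
r = unlearning_complexity_ratio(20.0, 100.0)
print(r)                                        # 0.2
print(classify_regime(r, noise_suffices=False))  # advantageous regime
```

In the paper's framing, which regime applies is determined by the data dimensionality, the number of samples to forget, and the privacy constraints; the toy above only labels a regime once those quantities have been folded into a single cost ratio.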
Problem

Research questions and friction points this paper is trying to address.

Optimizing Machine Unlearning efficiency
Establishing computational complexity bounds
Analyzing unlearning versus retraining costs
Innovation

Methods, ideas, or system contributions that make the work stand out.

Establishes unlearning computation bounds
Introduces unlearning complexity ratio
Identifies three distinct unlearning regimes