🤖 AI Summary
This paper studies provably secure data deletion in online learning: enabling models to continuously process streaming data while responding to arbitrary deletion requests such that post-deletion outputs are statistically indistinguishable from those of a model never trained on the deleted data. It formally defines the joint “online learning–dynamic unlearning” problem for the first time. The authors propose two algorithmic paradigms—passive and active—both built upon noisy, shrinkage-regularized online gradient descent (OGD); the active variant additionally applies an offline unlearning correction. Under standard convexity and smoothness assumptions, both algorithms achieve the optimal $O(\sqrt{T})$ regret bound of classical OGD; the passive variant incurs zero additional computational overhead, whereas the active variant introduces controllable overhead. The key contribution is the first theoretical framework simultaneously guaranteeing rigorous unlearning (i.e., statistical indistinguishability) and optimal online learning performance.
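To make the base learner concrete, here is a minimal sketch of one noisy, shrinkage-regularized OGD step. This is an illustration only, not the paper's algorithm: the function name and the parameters `eta` (step size), `lam` (shrinkage strength), and `sigma` (noise scale) are all hypothetical placeholders, and the paper calibrates these quantities to its indistinguishability guarantee.

```python
import numpy as np

def noisy_shrinkage_ogd_step(w, grad, eta=0.1, lam=0.01, sigma=0.05, rng=None):
    """One illustrative noisy, shrinkage-regularized OGD update (not the paper's exact rule)."""
    rng = np.random.default_rng() if rng is None else rng
    # Shrinkage term (1 - eta*lam) contracts the iterate toward the origin,
    # so the influence of any past gradient decays geometrically over time.
    w = (1.0 - eta * lam) * w - eta * grad
    # Gaussian noise masks the residual influence of any single data point,
    # which is what enables the statistical-indistinguishability argument.
    return w + sigma * rng.normal(size=w.shape)
```

The contraction plus per-step noise is the mechanism the passive variant exploits: old data points fade from the iterate geometrically, and noise hides whatever residue remains.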
📝 Abstract
We formalize the problem of online learning-unlearning, where a model is updated sequentially in an online setting while accommodating unlearning requests between updates. After a data point is unlearned, all subsequent outputs must be statistically indistinguishable from those of a model trained without that point. We present two online learner-unlearner (OLU) algorithms, both built upon online gradient descent (OGD). The first, passive OLU, leverages OGD's contractive property and injects noise when unlearning occurs, incurring no additional computation. The second, active OLU, uses an offline unlearning algorithm that shifts the model toward a solution excluding the deleted data. Under standard convexity and smoothness assumptions, both methods achieve regret bounds comparable to those of standard OGD, demonstrating that competitive regret can be maintained while providing rigorous unlearning guarantees.
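The passive OLU idea described above can be sketched as follows: run a contractive (shrinkage-regularized) OGD stream and, when a deletion request arrives, inject noise at that step. This is a toy illustration under squared loss; the noise scale `0.05` and all hyperparameters are illustrative assumptions, not the paper's calibrated values.

```python
import numpy as np

rng = np.random.default_rng(0)

def ogd_step(w, x, y, eta=0.1, lam=0.01):
    # Contractive shrinkage-regularized OGD step for squared loss.
    grad = (w @ x - y) * x
    return (1.0 - eta * lam) * w - eta * grad

w = np.zeros(3)
stream = [(rng.normal(size=3), rng.normal()) for _ in range(50)]
for t, (x, y) in enumerate(stream):
    w = ogd_step(w, x, y)
    if t == 25:  # an unlearning request arrives mid-stream
        # Passive OLU (sketch): add noise sized so the perturbed iterate is
        # statistically indistinguishable from one trained without the
        # deleted point; the scale here is a placeholder, not the paper's.
        w = w + 0.05 * rng.normal(size=w.shape)
```

The active variant would instead replace the noise-injection line with an offline unlearning correction that explicitly shifts `w` toward the retrained solution, trading extra computation for a more direct guarantee.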