🤖 AI Summary
This paper studies provably secure data deletion in online learning: enabling models to continuously process streaming data while responding to arbitrary deletion requests such that post-deletion outputs are statistically indistinguishable from those of a model never trained on the deleted data. It formally defines the joint “online learning–dynamic unlearning” problem for the first time. The authors propose two algorithmic paradigms—passive and active—both built upon noisy, shrinkage-regularized online gradient descent (OGD); the active variant additionally applies an offline unlearning correction. Under standard convexity and smoothness assumptions, both algorithms achieve the optimal $O(\sqrt{T})$ regret bound of classical OGD; the passive variant incurs zero additional computational overhead, whereas the active variant introduces controllable overhead. The key contribution is the first theoretical framework simultaneously guaranteeing rigorous unlearning (i.e., statistical indistinguishability) and optimal online learning performance.
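To make the base learner concrete, here is a minimal sketch of one noisy, shrinkage-regularized OGD step. This is an illustration only, not the paper's algorithm: the function name and the parameters `eta` (step size), `lam` (shrinkage strength), and `sigma` (noise scale) are all hypothetical placeholders, and the paper calibrates these quantities to its indistinguishability guarantee.

```python
import numpy as np

def noisy_shrinkage_ogd_step(w, grad, eta=0.1, lam=0.01, sigma=0.05, rng=None):
    """One illustrative noisy, shrinkage-regularized OGD update (not the paper's exact rule)."""
    rng = np.random.default_rng() if rng is None else rng
    # Shrinkage term (1 - eta*lam) contracts the iterate toward the origin,
    # so the influence of any past gradient decays geometrically over time.
    w = (1.0 - eta * lam) * w - eta * grad
    # Gaussian noise masks the residual influence of any single data point,
    # which is what enables the statistical-indistinguishability argument.
    return w + sigma * rng.normal(size=w.shape)
```

The contraction plus per-step noise is the mechanism the passive variant exploits: old data points fade from the iterate geometrically, and noise hides whatever residue remains.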
📝 Abstract
We formalize the problem of online learning-unlearning, where a model is updated sequentially in an online setting while accommodating unlearning requests between updates. After a data point is unlearned, all subsequent outputs must be statistically indistinguishable from those of a model trained without that point. We present two online learner-unlearner (OLU) algorithms, both built upon online gradient descent (OGD). The first, passive OLU, leverages OGD's contractive property and injects noise when unlearning occurs, incurring no additional computation. The second, active OLU, uses an offline unlearning algorithm that shifts the model toward a solution excluding the deleted data. Under standard convexity and smoothness assumptions, both methods achieve regret bounds comparable to those of standard OGD, demonstrating that competitive regret can be maintained while providing rigorous unlearning guarantees.
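The passive OLU idea described above can be sketched as follows: run a contractive (shrinkage-regularized) OGD stream and, when a deletion request arrives, inject noise at that step. This is a toy illustration under squared loss; the noise scale `0.05` and all hyperparameters are illustrative assumptions, not the paper's calibrated values.

```python
import numpy as np

rng = np.random.default_rng(0)

def ogd_step(w, x, y, eta=0.1, lam=0.01):
    # Contractive shrinkage-regularized OGD step for squared loss.
    grad = (w @ x - y) * x
    return (1.0 - eta * lam) * w - eta * grad

w = np.zeros(3)
stream = [(rng.normal(size=3), rng.normal()) for _ in range(50)]
for t, (x, y) in enumerate(stream):
    w = ogd_step(w, x, y)
    if t == 25:  # an unlearning request arrives mid-stream
        # Passive OLU (sketch): add noise sized so the perturbed iterate is
        # statistically indistinguishable from one trained without the
        # deleted point; the scale here is a placeholder, not the paper's.
        w = w + 0.05 * rng.normal(size=w.shape)
```

The active variant would instead replace the noise-injection line with an offline unlearning correction that explicitly shifts `w` toward the retrained solution, trading extra computation for a more direct guarantee.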