Machine Unlearning for Streaming Forgetting

๐Ÿ“… 2025-07-21
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Existing machine unlearning methods rely on batch processing, rendering them ill-suited for real-world streaming deletion requests that arrive continuously. Method: This paper formally defines the streaming unlearning problem as online knowledge erasure under distributional shift and proposes a dynamic unlearning framework that requires neither the original training data nor convexity assumptions. The method integrates cumulative variational estimation with online learning to enable immediate, single-request responses. Contribution/Results: We provide theoretical guarantees showing an error bound of $O(\sqrt{T} + V_T)$ under non-convex losses, breaking the traditional reliance on batch processing and convex loss functions. Experiments across diverse models and datasets demonstrate that our approach achieves superior forgetting accuracy and computational efficiency while preserving downstream task performance, significantly outperforming existing baselines.
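The single-request response loop can be illustrated with a toy sketch: process each deletion as it arrives by taking one gradient-ascent step on that example's loss, without revisiting the training set. This is a generic illustration under assumed notation (squared loss, fixed step size), not the paper's cumulative variational estimator.

```python
import numpy as np

def streaming_unlearn(w, forget_stream, lr=0.1, reg=0.01):
    """Toy streaming unlearning loop (illustrative, not the paper's method).

    For each deletion request (x, y), ascend the per-example squared loss
    to erase that example's influence from the linear model w, with a small
    L2 term for numerical stability. No original training data is needed.
    """
    grad_norms = []
    for x, y in forget_stream:
        residual = w @ x - y
        grad = residual * x + reg * w   # gradient of 0.5*(w·x - y)^2 + 0.5*reg*||w||^2
        w = w + lr * grad               # ascent step: increase loss on forgotten point
        grad_norms.append(float(np.linalg.norm(grad)))
    return w, grad_norms
```

Each request is handled immediately in O(d) time, which is the efficiency property the streaming setting demands; the paper's actual update additionally tracks the shifting data distribution across rounds.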

๐Ÿ“ Abstract
Machine unlearning aims to remove knowledge of the specific training data in a well-trained model. Currently, machine unlearning methods typically handle all forgetting data in a single batch, removing the corresponding knowledge all at once upon request. However, in practical scenarios, requests for data removal often arise in a streaming manner rather than in a single batch, leading to reduced efficiency and effectiveness in existing methods. Such challenges of streaming forgetting have not been the focus of much research. In this paper, to address the challenges of performance maintenance, efficiency, and data access brought about by streaming unlearning requests, we introduce a streaming unlearning paradigm, formalizing the unlearning as a distribution shift problem. We then estimate the altered distribution and propose a novel streaming unlearning algorithm to achieve efficient streaming forgetting without requiring access to the original training data. Theoretical analyses confirm an $O(\sqrt{T} + V_T)$ error bound on the streaming unlearning regret, where $V_T$ represents the cumulative total variation in the optimal solution over $T$ learning rounds. This theoretical guarantee is achieved under mild conditions without the strong restriction of a convex loss function. Experiments across various models and datasets validate the performance of our proposed method.
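The $O(\sqrt{T} + V_T)$ bound matches the standard dynamic-regret formulation from online learning; a sketch of the quantities involved, with notation assumed for illustration rather than taken from the paper:

```latex
% Dynamic regret over T unlearning rounds, with per-round loss f_t
% and per-round optimum w_t^* (illustrative notation):
\mathrm{Regret}_T = \sum_{t=1}^{T} f_t(w_t) - \sum_{t=1}^{T} f_t(w_t^*),
\qquad
V_T = \sum_{t=2}^{T} \lVert w_t^* - w_{t-1}^* \rVert .
```

Here $V_T$ captures how far the per-round optimal model drifts as deletions accumulate, which is why it appears additively in the bound: a stream of deletions that barely shifts the optimum costs only the usual $O(\sqrt{T})$ online-learning regret.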
Problem

Research questions and friction points this paper is trying to address.

Addresses inefficiency in batch-based machine unlearning methods
Proposes solution for streaming data removal requests
Ensures performance without accessing original training data
Innovation

Methods, ideas, or system contributions that make the work stand out.

Streaming unlearning paradigm for data removal
Formalizes unlearning as distribution shift problem
Efficient algorithm without original training data
๐Ÿ”Ž Similar Papers
No similar papers found.