🤖 AI Summary
This work proposes the first method for exact continual unlearning in federated learning when a trainable ridge-regression head is attached to a frozen foundation model. Exploiting the closed-form solution of ridge regression, the approach maintains two additive sufficient statistics and uses a fixed-size message protocol, supporting arbitrary sequences of client- or sample-level insertions and deletions. In exact arithmetic, the server-side model is pointwise identical to the model retrained centrally from scratch; in practice, the relative Frobenius error stays below $10^{-9}$. The method is deterministic and invariant to the order and partitioning of operations, and it carries a formal unlearning certificate of zero KL divergence under a Bayesian interpretation. Experiments on four benchmark datasets confirm its efficiency and numerical precision.
📝 Abstract
Foundation models are commonly deployed as frozen feature extractors with a small trainable head to adapt to private, user-generated data in federated settings. The ``right to be forgotten'' requires removing the influence of specific samples or users from the trained model on demand. Existing federated unlearning methods target general deep models and rely on approximate reconstruction or selective retraining, making exactness costly or elusive. We study this problem in a practically relevant but under-explored regime: a frozen foundation model with a ridge-regression head. The exact optimum depends on the data only through two additive sufficient statistics, which we turn into a communication protocol supporting an arbitrary stream of \emph{add} and \emph{delete} requests via fixed-size messages. The server maintains a head that is, in exact arithmetic, \emph{pointwise identical} to centralized retraining after every request. We provide deterministic retrain-equivalence guarantees, order and partition invariance, two server-side variants, and a Bayesian certificate of zero KL divergence. Experiments on four benchmarks confirm the guarantees: both variants match centralized ridge retraining to within $10^{-9}$ relative Frobenius error and complete each request at orders-of-
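The core idea can be illustrated with a short sketch (this is an illustrative reconstruction, not the paper's code; class and variable names are hypothetical). Since the ridge optimum $W = (A + \lambda I)^{-1} b$ depends on the data only through the additive statistics $A = \sum_i \phi_i \phi_i^\top$ and $b = \sum_i \phi_i y_i^\top$, a client's contribution can be added or subtracted exactly, and deletion coincides with centralized retraining on the remaining data:

```python
import numpy as np

class RidgeHead:
    """Hypothetical server-side ridge head over frozen features phi(x)."""

    def __init__(self, d, k, lam=0.1):
        self.A = np.zeros((d, d))   # sum of phi phi^T over all retained samples
        self.b = np.zeros((d, k))   # sum of phi y^T over all retained samples
        self.lam = lam

    def add(self, Phi, Y):
        # A client uploads only (Phi^T Phi, Phi^T Y): fixed-size messages.
        self.A += Phi.T @ Phi
        self.b += Phi.T @ Y

    def delete(self, Phi, Y):
        # Exact unlearning: subtract the same additive statistics.
        self.A -= Phi.T @ Phi
        self.b -= Phi.T @ Y

    def solve(self):
        d = self.A.shape[0]
        return np.linalg.solve(self.A + self.lam * np.eye(d), self.b)

rng = np.random.default_rng(0)
d, k = 8, 3
head = RidgeHead(d, k)
clients = [(rng.normal(size=(20, d)), rng.normal(size=(20, k)))
           for _ in range(3)]
for Phi, Y in clients:
    head.add(Phi, Y)
head.delete(*clients[1])  # forget client 1 entirely

# Centralized retraining from scratch on the remaining clients.
Phi_all = np.vstack([clients[0][0], clients[2][0]])
Y_all = np.vstack([clients[0][1], clients[2][1]])
W_retrain = np.linalg.solve(Phi_all.T @ Phi_all + head.lam * np.eye(d),
                            Phi_all.T @ Y_all)

err = np.linalg.norm(head.solve() - W_retrain) / np.linalg.norm(W_retrain)
assert err < 1e-9  # matches retraining up to floating-point round-off
```

The subtraction is exact in exact arithmetic and, as the final check suggests, agrees with retraining to roughly machine precision in floating point; order and partition invariance follow from the commutativity of the additions.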