AI Summary
This work addresses the challenge of provable data deletion in decentralized federated learning, where the influence of removed user data implicitly persists across the network due to the absence of a formal unlearning mechanism. The authors propose the first formal framework enabling provable forgetting by introducing a Newton-style correction update that leverages an approximation of the Fisher information matrix to capture loss curvature. This approach quantifies and mitigates the global impact of deleted data without requiring central coordination. By incorporating broadcasted correction updates and calibrated differential privacy noise, the method is theoretically guaranteed to produce a model statistically indistinguishable from one retrained from scratch, while preserving comparable utility. Extensive experiments demonstrate the framework's effectiveness, scalability, and efficiency across diverse scenarios.
Abstract
Driven by the right to be forgotten (RTBF), machine unlearning has become an essential requirement for privacy-preserving machine learning. However, its realization in decentralized federated learning (DFL) remains largely unexplored. In DFL, clients exchange local updates only with neighbors, causing model information to propagate and mix across the network. As a result, when a client requests data deletion, its influence is implicitly embedded throughout the system, making removal difficult without centralized coordination. We propose a novel certified unlearning framework for DFL based on Newton-style updates. Our approach first quantifies how a client's data influence propagates during training. Leveraging curvature information of the loss with respect to the target data, we then construct corrective updates using Newton-style approximations. To ensure scalability, we approximate second-order information via Fisher information matrices. The resulting updates are perturbed with calibrated noise and broadcast through the network to eliminate residual influence across clients. We theoretically prove that our approach satisfies the formal definition of certified unlearning, ensuring that the unlearned model is difficult to distinguish from a retrained model without the deleted data. We also establish utility bounds showing that the unlearned model remains close to retraining from scratch. Extensive experiments across diverse decentralized settings demonstrate the effectiveness and efficiency of our framework.
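The core mechanism described above — a one-step Newton-style correction that adds the inverse-curvature-weighted gradient of the forgotten data back to the trained model, with calibrated Gaussian noise for the certification guarantee — can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a simple logistic-regression model, uses a diagonal Fisher approximation in place of the full Hessian for scalability, and all function names (`unlearn_newton`, `fisher_diag`) are illustrative.

```python
import numpy as np

def logistic_grad(theta, X, y):
    # Per-sample gradients of the logistic loss, shape (n, d).
    p = 1.0 / (1.0 + np.exp(-X @ theta))
    return (p - y)[:, None] * X

def fisher_diag(theta, X, y, damping=1e-2):
    # Diagonal Fisher approximation of the per-sample Hessian:
    # mean of squared gradients, damped so the inverse is well-defined.
    g = logistic_grad(theta, X, y)
    return (g ** 2).mean(axis=0) + damping

def unlearn_newton(theta, X_forget, y_forget, X_retain, y_retain,
                   noise_scale=0.0, rng=None):
    # Newton-style correction: theta + H_retain^{-1} * grad of the
    # forgotten loss, where H_retain is approximated by the (scaled)
    # diagonal Fisher on the retained data. Gaussian noise with
    # calibrated scale perturbs the result for certified unlearning.
    rng = rng or np.random.default_rng(0)
    g_forget = logistic_grad(theta, X_forget, y_forget).sum(axis=0)
    H_diag = fisher_diag(theta, X_retain, y_retain) * len(y_retain)
    delta = g_forget / H_diag
    return theta + delta + noise_scale * rng.standard_normal(theta.shape)
```

In the decentralized setting the paper targets, the resulting (noised) correction `delta` would additionally be broadcast to neighboring clients so that residual influence propagated during training is removed network-wide; that gossip step is omitted here.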