🤖 AI Summary
Supporting the “right to be forgotten” in federated learning is challenging: distributed training obscures data provenance, and existing exact unlearning methods rely on frequent full retraining, incurring prohibitive communication overhead and service interruptions. To address this, we propose FedSGT, a novel framework that combines grouped client participation with serialized server-side training of Parameter-Efficient Fine-Tuning (PEFT) modules. Data are partitioned into mutually exclusive groups, each associated with a dedicated lightweight PEFT module. During unlearning, only the affected module is deactivated, enabling instantaneous, retraining-free exact unlearning. A multi-sequence design ensures long-term model stability and substantially extends the retraining-free service period. Experiments show that FedSGT maintains competitive learning performance and training efficiency while drastically reducing communication and storage overhead, making it practical to serve high-frequency unlearning requests.
📝 Abstract
Federated Learning (FL) enables collaborative, privacy-preserving model training, but supporting the "Right to be Forgotten" is especially challenging because data influences the model through distributed and interleaved client updates. Existing exact unlearning methods typically require frequent retraining from scratch, resulting in high communication cost and long service downtime. To address this, we propose Federated Sequential Group-based Training (FedSGT), an exact unlearning framework for FL. FedSGT partitions the data into uniform groups, and each client may participate in multiple groups. To control communication overhead, each client can limit the number of groups it contributes to. FedSGT then trains multiple sequences of Parameter-Efficient Fine-Tuning (PEFT) modules, each corresponding to a different group permutation. Since the PEFT modules are lightweight and maintained server-side, FedSGT isolates the influence of different data groups into independent modules without incurring significant storage overhead or communication cost. Exact unlearning is thus achieved instantly by deactivating the modules corresponding to the group containing the unlearned data. Furthermore, using multiple training sequences helps maintain high model utility as deletion requests accumulate. We provide a rigorous theoretical analysis of both the deletion rate -- the expected number of deletions before retraining is needed -- and the expected model performance. Experiments on various tasks demonstrate that FedSGT sustains service for significantly longer under multiple unlearning requests while maintaining learning performance and training efficiency comparable to other exact unlearning baselines. Extensive ablation studies validate the robustness of our method across a wide range of parameter settings.
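The core mechanism described above, isolating each data group's influence in its own lightweight PEFT module so that deactivating one module yields exact unlearning, can be illustrated with a toy sketch. This is not the paper's implementation: all names (`GroupedPEFTModel`, `train_group`, `unlearn_group`) are hypothetical, and the additive composition of adapters is a simplification of FedSGT's serialized, permutation-ordered module training.

```python
# Toy sketch of per-group PEFT modules with retraining-free exact unlearning.
# Names and the additive adapter composition are illustrative assumptions,
# not the paper's actual (serialized, multi-sequence) design.
import numpy as np

rng = np.random.default_rng(0)

class GroupedPEFTModel:
    """Frozen base model plus one lightweight adapter per data group.

    Each group's updates touch only its own adapter, so deleting a group's
    data only requires deactivating that adapter -- no retraining.
    """

    def __init__(self, dim: int, num_groups: int):
        self.base = rng.normal(size=(dim,))            # frozen base weights
        self.adapters = {g: np.zeros(dim) for g in range(num_groups)}
        self.active = set(range(num_groups))           # all groups serve by default

    def train_group(self, group: int, client_updates: list[np.ndarray]) -> None:
        # Server aggregates this group's client updates into its adapter only.
        self.adapters[group] += np.mean(client_updates, axis=0)

    def unlearn_group(self, group: int) -> None:
        # Exact unlearning: drop the adapter holding this group's influence.
        self.active.discard(group)

    def effective_weights(self) -> np.ndarray:
        # Serve with the base plus all still-active adapters.
        return self.base + sum(
            (self.adapters[g] for g in self.active),
            np.zeros_like(self.base),
        )

model = GroupedPEFTModel(dim=4, num_groups=3)
for g in range(3):
    model.train_group(g, [rng.normal(size=4) for _ in range(5)])

before = model.effective_weights()
model.unlearn_group(1)                 # instantaneous, no retraining
after = model.effective_weights()

# The served weights lose exactly group 1's contribution and nothing else.
assert np.allclose(after, before - model.adapters[1])
```

In the full method, modules within a sequence are trained serially (each conditioned on its predecessors), which is why deactivating a module invalidates its successors and why maintaining multiple sequences over different group permutations extends the retraining-free service period.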