Oblivionis: A Lightweight Learning and Unlearning Framework for Federated Large Language Models

📅 2025-08-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the lack of regulatory compliance, particularly GDPR's right to be forgotten, in federated learning (FL) for large language models (LLMs), this paper proposes the first lightweight dual-optimization framework that unifies FL and machine unlearning. The framework enables clients to selectively remove specific private data post-training, reconciling privacy compliance with model utility. Technically, it introduces an end-to-end evaluation pipeline covering six FL algorithms and five unlearning strategies. Extensive experiments across multiple benchmarks show that, compared to local training, the framework achieves a superior trade-off: a +12.3% unlearning success rate with only a 1.8% drop in downstream task performance. This work establishes a scalable, verifiable data-governance paradigm for trustworthy federated LLMs.

📝 Abstract
Large Language Models (LLMs) increasingly leverage Federated Learning (FL) to utilize private, task-specific datasets for fine-tuning while preserving data privacy. However, while federated LLM frameworks effectively enable collaborative training without raw data sharing, they critically lack built-in mechanisms for regulatory compliance like GDPR's right to be forgotten. Integrating private data heightens concerns over data quality and long-term governance, yet existing distributed training frameworks offer no principled way to selectively remove specific client contributions post-training. Due to distributed data silos, stringent privacy constraints, and the intricacies of interdependent model aggregation, federated LLM unlearning is significantly more complex than centralized LLM unlearning. To address this gap, we introduce Oblivionis, a lightweight learning and unlearning framework that enables clients to selectively remove specific private data during federated LLM training, enhancing trustworthiness and regulatory compliance. By unifying FL and unlearning as a dual optimization objective, we incorporate 6 FL and 5 unlearning algorithms for comprehensive evaluation and comparative analysis, establishing a robust pipeline for federated LLM unlearning. Extensive experiments demonstrate that Oblivionis outperforms local training, achieving a robust balance between forgetting efficacy and model utility, with cross-algorithm comparisons providing clear directions for future LLM development.
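The "dual optimization objective" unifying FL and unlearning can be sketched as follows. This is an illustrative formulation, not one taken from the paper: it assumes FedAvg-style client weighting $n_k/n$ and a forgetting weight $\lambda$ that trades off retention against removal.

```latex
\min_{\theta} \; \sum_{k=1}^{K} \frac{n_k}{n}
  \Big[ \mathcal{L}\big(\theta;\, D_k^{\mathrm{retain}}\big)
        \;-\; \lambda \,\mathcal{L}\big(\theta;\, D_k^{\mathrm{forget}}\big) \Big]
```

Here $\theta$ is the shared model, $D_k^{\mathrm{retain}}$ and $D_k^{\mathrm{forget}}$ are client $k$'s kept and to-be-forgotten data, and the negated second term turns unlearning into gradient ascent on the forget set while the first term preserves utility.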
Problem

Research questions and friction points this paper is trying to address.

Lack of GDPR compliance in federated LLM frameworks
No mechanism to remove specific client data post-training
Complexity of unlearning in distributed data silos
Innovation

Methods, ideas, or system contributions that make the work stand out.

Lightweight framework for federated LLM unlearning
Dual optimization of FL and unlearning algorithms
Balances forgetting efficacy and model utility
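The interplay above can be made concrete with a toy sketch: FedAvg aggregation combined with a gradient-difference unlearning step (descend on each client's retain loss, ascend on its forget loss). Everything here is an illustrative assumption, not the paper's implementation: a linear model with MSE loss, synthetic clients, and the weight `lam`.

```python
import numpy as np

rng = np.random.default_rng(0)

D_RETAIN, D_FORGET = 3, 2            # disjoint feature blocks (illustrative)
d = D_RETAIN + D_FORGET
w_true = rng.normal(size=d)          # shared ground-truth weights

def mse_loss(w, X, y):
    return 0.5 * np.mean((X @ w - y) ** 2)

def mse_grad(w, X, y):
    return X.T @ (X @ w - y) / len(y)

def client_update(w, retain, forget, lr=0.1, lam=0.5, steps=20):
    """Local step: descend on the retain loss, ascend on the forget loss
    (a gradient-difference style unlearning objective)."""
    (X_r, y_r), (X_f, y_f) = retain, forget
    w = w.copy()
    for _ in range(steps):
        w -= lr * (mse_grad(w, X_r, y_r) - lam * mse_grad(w, X_f, y_f))
    return w

def fedavg(updates, sizes):
    """FedAvg: average client models weighted by local data size."""
    total = sum(sizes)
    return sum((n / total) * u for u, n in zip(updates, sizes))

def make_client():
    # Retain samples use the first feature block, forget samples the second,
    # so the two objectives act on disjoint coordinates of w.
    X_r = np.zeros((40, d)); X_r[:, :D_RETAIN] = rng.normal(size=(40, D_RETAIN))
    X_f = np.zeros((10, d)); X_f[:, D_RETAIN:] = rng.normal(size=(10, D_FORGET))
    return (X_r, X_r @ w_true), (X_f, X_f @ w_true)

clients = [make_client() for _ in range(3)]
w = np.zeros(d)
for _ in range(5):                                   # federated rounds
    updates = [client_update(w, r, f) for r, f in clients]
    w = fedavg(updates, [len(r[1]) for r, _ in clients])

retain_loss = np.mean([mse_loss(w, *r) for r, _ in clients])
forget_loss = np.mean([mse_loss(w, *f) for _, f in clients])
```

After a few rounds, the retain loss shrinks toward zero while the forget loss grows, mirroring the forgetting-versus-utility balance the framework targets; in practice the ascent term would be clipped or scheduled rather than run unbounded.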