🤖 AI Summary
This work addresses the dual challenges of parameter efficiency and privacy preservation in federated learning (FL) for post-training large language models (LLMs) under black-box access, i.e., when only inference APIs are available. Methodologically, it integrates FL frameworks with parameter-efficient fine-tuning (PEFT) techniques to enable distributed, collaborative optimization of proprietary, API-hosted LLMs without exposing model parameters. The paper makes three key contributions: (1) it presents the first systematic taxonomy for federated LLM tuning (FedLLM), organized along two axes: model accessibility (white-box, gray-box, black-box) and parameter efficiency; (2) it surveys representative approaches within this unified classification and identifies core technical challenges; and (3) it charts the research trajectory for black-box FedLLM, bridging theoretical privacy guarantees with practical deployment requirements. This work lays the groundwork for privacy-preserving, scalable LLM personalization in decentralized settings.
📝 Abstract
Federated Learning (FL) enables training models across decentralized data silos while preserving client data privacy. Recent research has explored efficient methods for post-training large language models (LLMs) within FL to address computational and communication challenges. However, existing approaches often rely on access to LLMs' internal information, which is frequently restricted in real-world scenarios; an inference-only paradigm (black-box FedLLM) has emerged to address this limitation. This paper presents a comprehensive survey of federated tuning for LLMs. We propose a taxonomy that categorizes existing studies along two axes: model access and parameter efficiency. We classify FedLLM approaches into white-box, gray-box, and black-box techniques, highlighting representative methods within each category. We then review emerging research that treats LLMs as black-box inference APIs and discuss promising directions and open challenges for future research.
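To make the inference-only setting concrete, the minimal sketch below (all names hypothetical; `blackbox_score` is a toy stand-in for an LLM inference API, not any specific method from the survey) shows the general black-box FedLLM recipe: each client tunes a shared soft-prompt vector using only loss queries, via a two-point zeroth-order gradient estimate, and a server averages the per-client updates FedAvg-style without ever seeing raw client data or model parameters.

```python
import random

def blackbox_score(prompt_vec, client_data):
    # Hypothetical stand-in for an inference-only LLM API:
    # returns a task loss for a soft-prompt vector on this client's data.
    return sum((p - d) ** 2 for p, d in zip(prompt_vec, client_data))

def zo_grad(prompt, data, eps=1e-3):
    # Two-point zeroth-order gradient estimate along a random Gaussian
    # direction, using only black-box score queries (no gradient access).
    u = [random.gauss(0, 1) for _ in prompt]
    f_plus = blackbox_score([p + eps * ui for p, ui in zip(prompt, u)], data)
    f_minus = blackbox_score([p - eps * ui for p, ui in zip(prompt, u)], data)
    scale = (f_plus - f_minus) / (2 * eps)
    return [scale * ui for ui in u]

def fed_round(prompt, clients, lr=0.05):
    # Each client estimates a gradient locally; the server averages the
    # estimates (FedAvg-style) and updates the shared prompt.
    grads = [zo_grad(prompt, c) for c in clients]
    avg = [sum(g[i] for g in grads) / len(grads) for i in range(len(prompt))]
    return [p - lr * a for p, a in zip(prompt, avg)]

random.seed(0)
clients = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]]  # toy per-client data silos
prompt = [0.0, 0.0]                              # shared soft prompt
for _ in range(200):
    prompt = fed_round(prompt, clients)
```

Only scalar scores cross the API boundary, which is what distinguishes the black-box paradigm from white-box (full parameter access) and gray-box (partial access, e.g., embeddings or logits) approaches surveyed here; practical black-box methods replace this toy estimator with more sample-efficient derivative-free optimizers.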