🤖 AI Summary
To address the limitations of conventional prompt-tuning in federated learning (rigid, input-independent prompts; high communication and computational overhead; and the difficulty of balancing privacy preservation with model performance), this paper proposes FedDPG (Federated Dynamic Prompt Generator), an adaptive NLP framework based on dynamic prompt generation and lightweight fine-tuning. FedDPG introduces a learnable dynamic prompt generator that produces context-aware prompts conditioned on each input instance. Within the federated architecture, it freezes the pre-trained language model (PLM) backbone and transmits only compact, task-specific prompt-generator parameters; it also incorporates a local adaptation mechanism to enhance client-level personalisation. Experiments on three NLP benchmarks show that FedDPG substantially outperforms existing parameter-efficient fine-tuning approaches: it reduces communication costs by 42.6%, cuts local training time by 38.1%, and improves global model accuracy by 1.9–3.4 percentage points, all while maintaining strong privacy guarantees.
📝 Abstract
Pre-trained Language Models (PLMs) have demonstrated impressive performance across a wide range of NLP tasks. However, traditional fine-tuning of PLMs for downstream tasks entails significant computational overhead. Prompt-tuning has emerged as an efficient alternative: a small number of parameters are prepended to the input sequence, and only those are updated while the PLM's parameters remain frozen. However, these prompts stay fixed for all inputs, limiting the model's flexibility. Federated Learning (FL) has gained attention in recent years as a way to address growing concerns around data privacy, yet challenges such as clients' communication and computation constraints still need to be addressed. To mitigate these challenges, this paper introduces the Federated Dynamic Prompt Generator (FedDPG), which incorporates a dynamic prompt generator network to produce context-aware prompts conditioned on the given input, ensuring flexibility and adaptability while prioritising data privacy in federated learning settings. Our experiments on three NLP benchmark datasets show that FedDPG outperforms state-of-the-art parameter-efficient fine-tuning methods in terms of global model performance, while significantly reducing computation time and the number of parameters sent through the FL network.
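The core mechanism described above can be sketched in a few lines: a small trainable module maps a pooled representation of the input to a set of prompt vectors, which are prepended to the (frozen) PLM's token embeddings. This is a minimal NumPy illustration of the idea, not the paper's implementation; all class and parameter names (`DynamicPromptGenerator`, `d_model`, `n_prompts`) are hypothetical, and a real system would use a trained neural generator inside a PLM.

```python
import numpy as np

rng = np.random.default_rng(0)

class DynamicPromptGenerator:
    """Hypothetical sketch of a dynamic prompt generator: maps a pooled
    input representation to `n_prompts` context-aware prompt vectors.
    Only this module's weights would be trained and exchanged over the
    FL network; the PLM backbone stays frozen on each client."""

    def __init__(self, d_model=16, n_prompts=4):
        self.d_model = d_model
        self.n_prompts = n_prompts
        # The only trainable (and transmitted) parameters in this sketch.
        self.W = rng.normal(scale=0.02, size=(d_model, n_prompts * d_model))

    def trainable_params(self):
        return self.W.size

    def __call__(self, token_embeddings):
        # token_embeddings: (seq_len, d_model), e.g. from a frozen embedder.
        pooled = token_embeddings.mean(axis=0)                     # (d_model,)
        prompts = (pooled @ self.W).reshape(self.n_prompts, self.d_model)
        # Prepend the input-conditioned prompts to the token sequence.
        return np.concatenate([prompts, token_embeddings], axis=0)

gen = DynamicPromptGenerator()
x = rng.normal(size=(10, 16))   # embeddings for a 10-token input
augmented = gen(x)
print(augmented.shape)          # (14, 16): 4 generated prompts + 10 tokens
print(gen.trainable_params())   # 1024 weights, vs. millions in the PLM
```

Because the prompts depend on the pooled input, different inputs yield different prompt vectors, which is the flexibility a fixed soft prompt lacks; and because only `W` is updated and communicated, the per-round FL payload stays tiny relative to the frozen backbone.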