Beyond Parameter Aggregation: Semantic Consensus for Federated Fine-Tuning of LLMs

📅 2026-05-12

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses limitations of traditional federated fine-tuning, namely high communication overhead, strict model architecture alignment, and reliance on white-box access, which hinder its applicability to large language models and heterogeneous deployment settings. The authors propose a federated fine-tuning paradigm based on semantic consensus: clients fine-tune local models on private data and exchange generated text on a shared public prompt set; the server maps these outputs into a semantic space, constructs a per-prompt semantic consensus, and returns pseudo-labels for further local fine-tuning. By eliminating parameter aggregation and transmitting only generated outputs, this approach decouples communication cost from model size and naturally supports heterogeneous architectures and open-ended generation. Theoretical analysis and experiments indicate that it can match strong federated fine-tuning baselines while reducing communication by orders of magnitude (analytically, by a factor of 1,006 for Llama3.1-405B), along with reductions in runtime and energy consumption.
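The server-side step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the summary does not specify the semantic encoder or the consensus rule, so the bag-of-words `embed` function, the medoid-style selection in `semantic_consensus`, and the `server_round` helper are all hypothetical stand-ins chosen only to keep the example self-contained.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding. ASSUMPTION: stands in for a real sentence encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_consensus(outputs):
    """Pick the medoid: the output most similar, in total, to all other outputs.

    ASSUMPTION: a medoid is one plausible consensus rule, not necessarily the paper's.
    """
    vectors = [embed(o) for o in outputs]

    def total_similarity(i):
        return sum(
            cosine(vectors[i], vectors[j]) for j in range(len(vectors)) if j != i
        )

    best = max(range(len(outputs)), key=total_similarity)
    return outputs[best]


def server_round(client_outputs):
    """Map each public prompt to one pseudo-label built from the clients' generations.

    client_outputs: dict of prompt -> list of generated strings (one per client).
    """
    return {
        prompt: semantic_consensus(outs) for prompt, outs in client_outputs.items()
    }


if __name__ == "__main__":
    generations = {
        "What is the capital of France?": [
            "The capital of France is Paris",
            "Paris is the capital of France",
            "I like bananas",
        ]
    }
    print(server_round(generations))
```

Note that only text crosses the network in this sketch: the server never sees weights, so clients could in principle run different architectures, which is the property the summary emphasizes.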

📝 Abstract

Federated fine-tuning of large language models is commonly formulated as a parameter aggregation problem. However, even parameter-efficient methods require transmitting large collections of trainable weights, assume aligned architectures, and rely on white-box access to model parameters. As model sizes continue to grow and deployments become increasingly heterogeneous, these assumptions become progressively misaligned with practical constraints. We consider an alternative formulation in which collaboration is mediated through model behavior rather than parameters. Clients fine-tune local models on private data and exchange generated outputs on a shared, public prompt set. The server maps these outputs into a semantic representation space, forms a per-prompt semantic consensus, and returns pseudo-labels for further local fine-tuning. This formulation fundamentally changes the communication scaling of federated LLM fine-tuning. The amount of information exchanged depends only on the public prompt budget and the size of the communicated behaviors, independent of model size. As a consequence, the protocol naturally accommodates heterogeneous architectures and applies directly to open-ended text generation. We present a theoretical analysis and empirical results demonstrating that this approach can match strong federated fine-tuning baselines while reducing communication by orders of magnitude (e.g., analytically by a factor of $1006$ for Llama3.1-405B) and also reducing runtime and energy consumption. These results suggest that, for generative foundation models, behavior-level consensus provides a more appropriate abstraction for federated adaptation than parameter aggregation.
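The scaling claim can be made concrete with a back-of-the-envelope Python sketch. It is an illustration under stated assumptions, not the paper's cost model: the function names, the byte sizes, and every number in the usage section are hypothetical, and the sketch does not attempt to reproduce the reported factor of 1006.

```python
def parameter_aggregation_bytes(trainable_params, bytes_per_param=2):
    """Per-client, per-round traffic when exchanging weights.

    ASSUMPTION: the client uploads its trainable weights and downloads the
    aggregated ones, so traffic is twice the weight payload.
    """
    return 2 * trainable_params * bytes_per_param


def behavior_exchange_bytes(num_prompts, tokens_per_output, bytes_per_token=4):
    """Per-client, per-round traffic when exchanging generated text.

    ASSUMPTION: the client uploads one generated output per public prompt and
    downloads one pseudo-label of similar length per prompt.
    """
    return 2 * num_prompts * tokens_per_output * bytes_per_token


if __name__ == "__main__":
    # Hypothetical prompt budget; it does not depend on the model at all.
    behavior = behavior_exchange_bytes(num_prompts=1000, tokens_per_output=128)

    # Hypothetical trainable-parameter counts for a small and a large model.
    for trainable in (10_000_000, 100_000_000):
        weights = parameter_aggregation_bytes(trainable)
        print(f"trainable={trainable:>11,}  weights={weights:>13,} B  "
              f"behavior={behavior:,} B  ratio={weights / behavior:.1f}x")
```

In this sketch the behavior-exchange cost has no model-size argument, so it stays fixed while the weight-exchange cost grows linearly with the number of trainable parameters. That is the sense in which the abstract describes communication as independent of model size.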

Problem

Research questions and friction points this paper is trying to address.

federated fine-tuning

large language models

parameter aggregation

heterogeneous architectures

communication efficiency

Innovation

Methods, ideas, or system contributions that make the work stand out.

semantic consensus

federated fine-tuning

large language models