🤖 AI Summary
In federated learning (FL), severe heterogeneity across clients in data distributions, model capacities, and computational resources causes semantic misalignment when representation fine-tuning (ReFT) updates are aggregated. To address this, we propose FedReFT: a lightweight framework that applies sparse interventions to hidden-layer representations for efficient client-specific adaptation, coupled with an *All-But-Me* aggregation strategy in which each client excludes its own update from the aggregate and partially incorporates the others', balancing personalization with global consistency. Crucially, FedReFT leaves the original model parameters untouched; task adaptation is achieved through low-rank perturbations of hidden representations. Extensive experiments on commonsense reasoning, arithmetic reasoning, instruction tuning, and GLUE benchmarks show that FedReFT consistently outperforms mainstream parameter-efficient fine-tuning (PEFT) methods, achieving 7–15× higher parameter efficiency than leading LoRA-based approaches. Its minimal memory and compute footprint also makes it deployable on resource-constrained edge devices.
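To make the representation-level intervention concrete, here is a minimal NumPy sketch of a LoReFT-style low-rank edit from the ReFT literature, h' = h + Rᵀ(Wh + b − Rh), which edits the hidden state inside an r-dimensional subspace while leaving the base model weights untouched. The dimensions, initialization, and exact parameterization here are illustrative assumptions; FedReFT's own intervention layers may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 8, 2  # hidden size and low intervention rank (illustrative values)

# LoReFT-style parameterization: h' = h + R^T (W h + b - R h),
# where R has orthonormal rows spanning the intervention subspace.
R = np.linalg.qr(rng.normal(size=(d, r)))[0].T  # shape (r, d), orthonormal rows
W = rng.normal(size=(r, d))
b = rng.normal(size=r)

def intervene(h):
    """Edit the hidden representation h within an r-dim subspace.

    Only R, W, b are trainable; the pre-trained model's weights
    are never modified, which is what keeps the method lightweight.
    """
    return h + R.T @ (W @ h + b - R @ h)

h = rng.normal(size=d)
h_new = intervene(h)
```

Because the edit h' − h lies entirely in the row space of R, the component of h outside that subspace passes through unchanged, which is the sense in which the intervention is "sparse" in representation space.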
📝 Abstract
Parameter-efficient fine-tuning (PEFT) has attracted significant attention for adapting large pre-trained models by modifying only a small subset of parameters. Recently, Representation Fine-Tuning (ReFT) has emerged as an effective alternative: it shifts the fine-tuning paradigm from updating model weights to directly manipulating hidden representations that carry rich semantic information, and it outperforms state-of-the-art PEFT methods in standalone settings. However, applying ReFT in Federated Learning (FL) remains challenging due to heterogeneity in clients' data distributions, model capacities, and computational resources. To address these challenges, we introduce Federated Representation Fine-Tuning (FedReFT), a novel approach that fine-tunes clients' hidden representations. FedReFT applies sparse intervention layers to steer hidden representations directly, offering a lightweight, semantically rich fine-tuning alternative well suited to edge devices. However, representation-level updates are especially vulnerable to aggregation mismatch under task heterogeneity, where naive averaging can corrupt semantic alignment. To mitigate this, we propose All-But-Me (ABM) aggregation, in which each client receives the aggregated updates of the other clients and partially incorporates them, enabling stable, personalized learning that balances local focus with global knowledge. We evaluate FedReFT on commonsense reasoning, arithmetic reasoning, instruction tuning, and GLUE, where it consistently outperforms state-of-the-art PEFT methods in FL, achieving 7–15× higher parameter efficiency than leading LoRA-based approaches.
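The All-But-Me idea described above can be sketched in a few lines: each client receives the mean of every *other* client's update and blends it with its own. This is a minimal illustration under assumptions not stated in the abstract; the uniform averaging and the mixing coefficient `beta` are hypothetical stand-ins for whatever weighting and incorporation rule the full paper uses.

```python
import numpy as np

def all_but_me_aggregate(updates, beta=0.5):
    """All-But-Me (ABM) aggregation, sketched.

    For client i, build the average of all other clients' updates
    (the "all but me" aggregate), then blend it with client i's own
    update. `beta` is a hypothetical mixing coefficient controlling
    how much global knowledge the client incorporates.
    """
    n = len(updates)
    total = sum(updates)
    personalized = []
    for u in updates:
        abm = (total - u) / (n - 1)  # aggregate of everyone but me
        personalized.append((1 - beta) * u + beta * abm)
    return personalized

# Three clients with scalar "updates" for illustration.
clients = [np.array([1.0]), np.array([3.0]), np.array([5.0])]
personalized = all_but_me_aggregate(clients, beta=0.5)
```

Unlike plain FedAvg, each client ends up with a different blended update, which is how ABM trades off local focus (its own update) against global knowledge (the others' aggregate).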