Adaptive Selection of LoRA Components in Privacy-Preserving Federated Learning

📅 2026-05-07

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In differentially private federated fine-tuning, LoRA's multiplicative structure introduces an aggregation error that DP noise amplifies, degrading both stability and accuracy. This work proposes AS-LoRA, a framework in which each layer independently selects its active LoRA component and updates that selection across communication rounds. The selection is driven by a curvature-aware score derived from a second-order approximation of the loss, and it incurs no additional privacy cost. In theory, this eliminates the reconstruction-error floor of layer-tied schedules, accelerates convergence, and implicitly biases solutions toward flatter minima. On GLUE, SQuAD, CIFAR-100, and Tiny-ImageNet under strict DP budgets and non-IID partitions, AS-LoRA improves over federated LoRA baselines by up to 7.5 pp on GLUE and, for example, 12.5 pp on MNLI-mm, while matching or exceeding SVD-based aggregation methods at 33–180× lower aggregation cost.
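
The aggregation error that comes from LoRA's multiplicative structure can be made concrete with a short identity. This is only a sketch: it assumes $K$ clients with per-client factors $B_k$ and $A_k$ and plain unweighted averaging, and these symbols are illustrative rather than taken from the paper.

```latex
\[
\frac{1}{K}\sum_{k=1}^{K} B_k A_k \;-\; \bar{B}\,\bar{A}
  \;=\; \frac{1}{K}\sum_{k=1}^{K} (B_k - \bar{B})(A_k - \bar{A}),
\qquad
\bar{B} = \frac{1}{K}\sum_{k=1}^{K} B_k,\quad
\bar{A} = \frac{1}{K}\sum_{k=1}^{K} A_k .
\]
```

The right-hand side vanishes when all clients share the same $A_k$ or the same $B_k$, which suggests why updating only one factor per layer at a time removes this particular error term.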

📝 Abstract

Differentially private federated fine-tuning of large models with LoRA suffers from aggregation error caused by LoRA's multiplicative structure, which is further amplified by DP noise and degrades both stability and accuracy. Existing remedies apply a single update mode uniformly across all layers and all communication rounds (or alternate between modes on a fixed schedule), ignoring both the structural asymmetry between the two LoRA factors and the round-wise dynamics of training. We propose AS-LoRA, an adaptive framework defined by three axes: (i) layer-wise freedom, in which each layer independently selects its active component; (ii) round-wise adaptivity, in which the selection updates over communication rounds; and (iii) a curvature-aware score derived from a second-order approximation of the loss. Theoretically, AS-LoRA eliminates the reconstruction-error floor of layer-tied schedules, accelerates convergence, implicitly biases solutions toward flatter minima, and incurs no additional privacy cost. Across GLUE, SQuAD, CIFAR-100, and Tiny-ImageNet under strict DP budgets and non-IID partitions, AS-LoRA improves over the federated LoRA baselines by up to $+7.5$ pp on GLUE and, for example, $+12.5$ pp on MNLI-mm, while matching or exceeding SVD-based aggregation methods at $33\text{--}180 \times$ lower aggregation cost and with negligible communication overhead. Code for the proposed method is available at https://anonymous.4open.science/r/as_lora-F75F/.
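
To illustrate the aggregation error and its interaction with noise, the following is a minimal NumPy sketch, not the paper's implementation. It assumes unweighted averaging of the factors on the server and Gaussian noise added to the averaged factors; the sizes `K`, `d`, `r`, the noise scale `sigma`, and all variable names are hypothetical choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d, r = 8, 16, 4   # clients, layer width, LoRA rank (illustrative values)
sigma = 0.1          # noise std on aggregated factors (illustrative value)

# Per-client LoRA factors: the weight update of client k is B[k] @ A[k].
B = rng.normal(size=(K, d, r))
A = rng.normal(size=(K, r, d))

# Case 1: both factors differ across clients.
ideal = np.mean(B @ A, axis=0)            # average of the per-client products
naive = B.mean(axis=0) @ A.mean(axis=0)   # product of the averaged factors
err_both = np.linalg.norm(ideal - naive)

# Case 2: one factor is shared (frozen), so only B differs across clients.
A0 = A[0]
ideal_frozen = np.mean(B @ A0, axis=0)
naive_frozen = B.mean(axis=0) @ A0
err_frozen = np.linalg.norm(ideal_frozen - naive_frozen)

# Noise on both averaged factors produces a noise-times-noise cross term.
B_bar, A_bar = B.mean(axis=0), A.mean(axis=0)
xi_B = sigma * rng.normal(size=B_bar.shape)
xi_A = sigma * rng.normal(size=A_bar.shape)
noisy_minus_clean = (B_bar + xi_B) @ (A_bar + xi_A) - B_bar @ A_bar
expansion = B_bar @ xi_A + xi_B @ A_bar + xi_B @ xi_A

print(f"error, both factors trained: {err_both:.3e}")
print(f"error, one factor frozen:    {err_frozen:.3e}")
print(f"quadratic noise term norm:   {np.linalg.norm(xi_B @ xi_A):.3e}")
```

When only one factor varies across clients, averaging that factor reproduces the averaged product up to floating-point error, and the noise enters only linearly. Which factor to keep active in each layer and round is the choice that AS-LoRA makes adaptively; the scoring rule itself is not reproduced here.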

Problem

Research questions and friction points this paper is trying to address.

Federated Learning

Differential Privacy

LoRA

Aggregation Error

Non-IID

Innovation

Methods, ideas, or system contributions that make the work stand out.

Adaptive LoRA

Federated Learning

Differential Privacy