🤖 AI Summary
This work proposes a collaborative fine-tuning framework tailored for wireless environments to address key challenges in decentralized federated learning, including catastrophic forgetting due to data heterogeneity, high communication overhead, and interference among multiple tasks. The approach integrates sparse orthogonal LoRA adapters to mitigate conflicting parameter updates across tasks, employs a clustering-based dynamic topology aggregation strategy to enhance communication efficiency, and incorporates an implicit mixture-of-experts (MoE) mechanism to alleviate knowledge interference during inference. Experimental results demonstrate that the proposed method reduces communication costs by up to 73% compared to conventional LoRA-based approaches, achieves an average performance gain of 5%, and significantly improves both training stability and multi-task inference capability.
📝 Abstract
Decentralized federated learning (DFL) based on low-rank adaptation (LoRA) enables mobile devices with multi-task datasets to collaboratively fine-tune a large language model (LLM) by exchanging locally updated parameters with a subset of neighboring devices via wireless connections for knowledge integration. However, directly aggregating parameters fine-tuned on heterogeneous datasets induces three primary issues across the DFL life-cycle: (i) \textit{catastrophic knowledge forgetting during the fine-tuning process}, arising from conflicting update directions caused by data heterogeneity; (ii) \textit{inefficient communication and convergence during the model aggregation process}, due to bandwidth-intensive redundant model transmissions; and (iii) \textit{multi-task knowledge interference during the inference process}, resulting from the coexistence of incompatible knowledge representations. To address these issues in a fully decentralized scenario, we first propose a sparse-and-orthogonal LoRA that ensures orthogonality between model updates to eliminate direction conflicts during fine-tuning. Then, we analyze how the device connection topology affects multi-task performance, which motivates a cluster-based topology design for aggregation. Finally, we propose an implicit mixture of experts (MoE) mechanism to avoid the coexistence of incompatible knowledge during inference. Simulation results demonstrate that the proposed approach reduces communication resource consumption by up to $73\%$ and improves average performance by $5\%$ compared with the traditional LoRA method.
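To make the idea of conflict-free aggregation concrete, the following is a minimal sketch and not the construction used in this work. It assumes, purely for illustration, that each task's LoRA update $\Delta W = BA$ has a down-projection $A$ that is nonzero only on a task-specific set of input columns, and that the two sets are disjoint. Under that assumption the two updates have zero Frobenius inner product, so averaging them cannot cancel either one. The helper names (`sparse_lora_update`, `frobenius_inner`), the matrix sizes, and the column supports are all hypothetical.

```python
import random


def matmul(B, A):
    """Multiply a d x r matrix B by an r x k matrix A (both lists of rows)."""
    return [[sum(B[i][t] * A[t][j] for t in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]


def frobenius_inner(X, Y):
    """Frobenius inner product <X, Y> = sum of elementwise products."""
    return sum(x * y for row_x, row_y in zip(X, Y) for x, y in zip(row_x, row_y))


def sparse_lora_update(d, k, r, support, rng):
    """Return delta_W = B @ A, where A is nonzero only on columns in `support`.

    Assumption for illustration: sparsity is imposed as a column mask on A.
    """
    B = [[rng.gauss(0, 1) for _ in range(r)] for _ in range(d)]
    A = [[rng.gauss(0, 1) if j in support else 0.0 for j in range(k)]
         for _ in range(r)]
    return matmul(B, A)


rng = random.Random(0)
d, k, r = 6, 8, 2  # hypothetical layer shape (d x k) and LoRA rank r

# Two tasks with disjoint column supports: their updates never overlap.
delta_task1 = sparse_lora_update(d, k, r, {0, 1, 2, 3}, rng)
delta_task2 = sparse_lora_update(d, k, r, {4, 5, 6, 7}, rng)
overlap_sparse = frobenius_inner(delta_task1, delta_task2)

# Dense baseline: both updates touch every column, so they generally overlap.
delta_dense1 = sparse_lora_update(d, k, r, set(range(k)), rng)
delta_dense2 = sparse_lora_update(d, k, r, set(range(k)), rng)
overlap_dense = frobenius_inner(delta_dense1, delta_dense2)

# Plain averaging, as a stand-in for neighbor aggregation.
averaged = [[(a + b) / 2 for a, b in zip(row1, row2)]
            for row1, row2 in zip(delta_task1, delta_task2)]

print("sparse overlap:", overlap_sparse)
print("dense overlap:", overlap_dense)
```

With disjoint supports, the averaged matrix holds a scaled copy of each task's update in its own columns. Two dense updates, by contrast, generally have a nonzero inner product and can partially cancel when averaged.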