🤖 AI Summary
This work addresses the performance bottleneck in federated fine-tuning of large language models (LLMs) on memory-constrained edge devices by proposing ChainFed, a chain-based federated fine-tuning paradigm. Instead of conventional end-to-end parameter updates, ChainFed sequentially trains and freezes optimized layers to construct a lightweight optimization chain. The approach introduces three core mechanisms: dynamic layer-wise co-optimization, global-aware optimization, and function-oriented adaptive tuning, which collectively reduce memory overhead while enhancing model adaptation efficacy. Experimental results demonstrate that ChainFed achieves up to a 46.46% average accuracy improvement across multiple benchmarks, significantly outperforming existing methods.
📝 Abstract
Federated fine-tuning enables privacy-preserving LLM adaptation but faces a critical bottleneck: the disparity between LLMs' high memory demands and edge devices' limited capacity. To break this memory barrier, we propose Chain Federated Fine-Tuning (ChainFed), a paradigm that forgoes end-to-end updates in favor of sequential, layer-by-layer training. It first trains the initial adapter to convergence, freezes its weights, and then proceeds to the next layer. This iterative train-and-freeze process forms an optimization chain that gradually enhances the model's task-specific proficiency. ChainFed further integrates three core techniques: 1) Dynamic Layer Co-Tuning, which bridges semantic gaps between sequentially tuned layers and facilitates information flow; 2) Globally Perceptive Optimization, which endows each adapter with foresight beyond its local objective; and 3) Function-Oriented Adaptive Tuning, which automatically identifies the optimal starting layer for fine-tuning. Extensive experiments on multiple benchmarks demonstrate the superiority of ChainFed over existing methods, boosting average accuracy by up to 46.46%.
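The train-and-freeze chain at the heart of ChainFed can be illustrated with a minimal sketch. This is not the paper's implementation: the per-layer objective, learning rate, and function names below are assumptions chosen only to show the control flow of sequentially optimizing one adapter at a time while keeping earlier ones frozen.

```python
# Toy sketch of a chain-based train-and-freeze loop (illustrative only).
# Each "adapter" is a single scalar weight; the quadratic loss and its
# gradient are stand-ins for a real per-layer fine-tuning objective.

def train_layer(weight, grad_fn, lr=0.1, steps=100):
    """Run plain gradient descent on one adapter weight to convergence."""
    for _ in range(steps):
        weight -= lr * grad_fn(weight)
    return weight

def chain_finetune(num_layers=3, target=2.0):
    """Sequentially train each layer's adapter, then freeze it.

    Frozen weights are never revisited, so only one adapter is
    "live" at any time -- the source of the memory savings.
    """
    frozen = []  # already-optimized adapters, excluded from training
    for _layer in range(num_layers):
        # Toy objective: drive this adapter's weight toward `target`;
        # gradient of (w - target)^2 is 2 * (w - target).
        grad_fn = lambda w: 2.0 * (w - target)
        w = train_layer(0.0, grad_fn)
        frozen.append(w)  # freeze: this weight receives no further updates
    return frozen

weights = chain_finetune()
```

Because each layer is trained in isolation and then frozen, the optimizer state exists for only one adapter at a time, which is the mechanism the abstract credits for fitting fine-tuning within edge-device memory budgets.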