🤖 AI Summary
How context length and context quality affect large language model (LLM) fine-tuning performance in heterogeneous federated learning remains poorly understood.
Method: We propose CLLoRA—a parameter-efficient federated fine-tuning framework based on Low-Rank Adaptation (LoRA)—designed to enable systematic evaluation of multi-scale LLMs under non-IID contextual settings.
Contribution/Results: We empirically identify context quality imbalance—not context length—as the primary driver of both local degradation and global performance decline; context length only significantly affects global convergence. Building on this insight, we introduce a quantifiable dual-dimensional metric jointly assessing context quality and length. Extensive experiments across diverse LLMs (e.g., Llama-2, Phi-3) and heterogeneous datasets validate the robustness and reproducibility of our findings. CLLoRA establishes a novel evaluation paradigm for privacy-preserving, efficient federated fine-tuning and provides actionable guidelines for practical deployment.
📝 Abstract
Fine-tuning is an efficient approach to adapting pre-trained large language models (LLMs) to new domains. To preserve data privacy, models are often fine-tuned in federated learning environments across different data owners, which introduces data heterogeneity issues that degrade fine-tuning performance. In addition, the length of the context in the training data has been identified as a major factor affecting LLM performance. To efficiently measure how context length affects LLM performance in heterogeneous federated learning environments, we propose CLLoRA. CLLoRA uses the parameter-efficient fine-tuning approach LoRA on LLMs of different families and sizes to investigate whether the quality and length of contexts can serve as standards for measuring non-IID context. The findings indicate that an imbalance in context quality affects not only local training on clients but also the global model's performance, whereas context length has a minimal effect on local training but a more significant influence on the global model. These results provide insights into how context quality and length affect model performance for LLM fine-tuning in federated learning environments.
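To make the setup concrete, below is a minimal sketch of the two building blocks the abstract names: LoRA's low-rank weight update and a FedAvg-style aggregation of the LoRA factors across clients. All dimensions, ranks, and the averaging rule here are illustrative assumptions for exposition, not CLLoRA's actual configuration.

```python
import numpy as np

# Minimal sketch of LoRA (Low-Rank Adaptation): the full weight update dW
# is replaced by a low-rank product B @ A, so only A and B are trained.
# Shapes and rank below are hypothetical, not the paper's settings.

def lora_delta(A, B):
    """Low-rank update: dW = B @ A, with rank r << min(d, k)."""
    return B @ A

d, k, r = 64, 64, 4  # hidden dims and LoRA rank (illustrative)
rng = np.random.default_rng(0)
W = rng.normal(size=(d, k))          # frozen pre-trained weight
A = rng.normal(size=(r, k)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # B starts at zero, so dW = 0 initially

# Effective weight used during fine-tuning
W_eff = W + lora_delta(A, B)

# Federated round (FedAvg-style assumption): each client trains its own
# copy of the small LoRA factors; the server averages only those factors,
# which is far cheaper than communicating or averaging the full W.
client_As = [A + rng.normal(size=A.shape) * 0.01 for _ in range(3)]
client_Bs = [B + rng.normal(size=B.shape) * 0.01 for _ in range(3)]
A_global = np.mean(client_As, axis=0)
B_global = np.mean(client_Bs, axis=0)

# Parameter-efficiency: LoRA trains r*(d+k) values instead of d*k.
full_params = d * k
lora_params = r * (d + k)
print(lora_params, full_params)
```

The parameter count is what makes this practical in federated settings: clients exchange only the small A and B factors each round, keeping communication and privacy surface proportional to the rank r rather than the full model size.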