🤖 AI Summary
Federated learning (FL) suffers from degraded generalization performance due to data heterogeneity, which induces inconsistent local optima across clients; yet existing analyses predominantly focus on either convergence or stability in isolation, failing to characterize the intrinsic mechanisms behind generalization deterioration in neural networks.
Method: We propose Libra—the first FL generalization dynamics framework tailored for neural networks—unifying algorithmic stability and optimization dynamics to jointly quantify their trade-off in excess risk evolution.
Contribution/Results: We theoretically establish that increasing local update steps or server-side momentum degrades stability but reduces the minimal achievable excess risk. Libra transcends conventional analytical limitations by offering an interpretable, quantifiable theoretical foundation for designing FL algorithms with enhanced generalization capability—bridging the gap between stability, optimization behavior, and generalization in heterogeneous settings.
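The stability–optimization trade-off described above can be sketched with the standard excess-risk decomposition (the notation below is generic textbook notation, not taken from the paper):

```latex
% w_T: model after T rounds; F: population risk; F_S: empirical risk on sample S.
% The expected excess risk splits into a stability term and an optimization term:
\mathbb{E}\,[F(w_T)] - F(w^\ast)
  \;\le\;
  \underbrace{\mathbb{E}\,[F(w_T) - F_S(w_T)]}_{\text{generalization gap} \;\le\; \epsilon_{\mathrm{stab}}(T)}
  \;+\;
  \underbrace{\mathbb{E}\,[F_S(w_T)] - \min_{w} F_S(w)}_{\text{optimization error} \;=\; \epsilon_{\mathrm{opt}}(T)}
```

Larger local steps or momentum shrink $\epsilon_{\mathrm{opt}}(T)$ faster but inflate $\epsilon_{\mathrm{stab}}(T)$; the minimum of their sum over $T$ is the "minimal achievable excess risk" the summary refers to.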
📝 Abstract
Federated Learning (FL) is a distributed learning approach that trains machine learning models across multiple devices while keeping each device's local data private. However, FL often faces challenges due to data heterogeneity, leading to inconsistent local optima among clients. These inconsistencies can cause unfavorable convergence behavior and degraded generalization performance. Existing studies mainly describe this issue through *convergence analysis*, which focuses on how well a model fits the training data, or through *algorithmic stability*, which examines the generalization gap. However, neither approach precisely captures the generalization performance of FL algorithms, especially for neural networks. This paper introduces an innovative generalization dynamics analysis framework, named Libra, for algorithm-dependent excess risk minimization, highlighting the trade-off between model stability and optimization. Through this framework, we show how the generalization of FL algorithms is shaped by the interplay of algorithmic stability and optimization. The framework applies to standard federated optimization and its advanced variants, such as server momentum. Our findings suggest that larger local steps or momentum accelerate convergence but worsen stability, while yielding a better minimum excess risk. These insights can guide the design of future algorithms with stronger generalization.
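To make the two knobs the abstract discusses concrete, here is a minimal sketch of federated averaging with server-side momentum on a toy least-squares problem. This is an illustration under our own assumptions, not the paper's algorithm: `local_steps` is the number of local SGD steps and `momentum` is the server heavy-ball coefficient, the two quantities whose increase trades convergence speed against stability.

```python
import numpy as np

def local_sgd(w, data, lr=0.1, local_steps=5):
    """Run `local_steps` SGD steps on one client's objective 0.5*||A w - b||^2.

    More local steps fit the client's own optimum better, but under data
    heterogeneity the clients' optima disagree, so long local runs drift.
    """
    A, b = data
    for _ in range(local_steps):
        grad = A.T @ (A @ w - b)
        w = w - lr * grad
    return w

def fedavg_server_momentum(clients, dim, rounds=50, local_steps=5,
                           server_lr=1.0, momentum=0.9):
    """FedAvg with server-side momentum (illustrative sketch).

    The server averages the clients' parameter updates, treats the average
    as a pseudo-gradient, and applies a heavy-ball momentum step to it.
    """
    w = np.zeros(dim)
    buf = np.zeros(dim)  # server momentum buffer
    for _ in range(rounds):
        # Average client update relative to the current global model.
        delta = np.mean([local_sgd(w.copy(), d, local_steps=local_steps) - w
                         for d in clients], axis=0)
        buf = momentum * buf + delta
        w = w + server_lr * buf
    return w

# Two heterogeneous clients whose local optima disagree: (1, 0) vs (0, 1).
clients = [(np.eye(2), np.array([1.0, 0.0])),
           (np.eye(2), np.array([0.0, 1.0]))]
w = fedavg_server_momentum(clients, dim=2)
```

On this quadratic toy problem the global model settles near the average of the two client optima, `(0.5, 0.5)`; raising `local_steps` or `momentum` speeds up that convergence but makes each round's update more sensitive to any one client's data, which is the stability cost the abstract describes.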