🤖 AI Summary
This work addresses client drift in federated learning, which arises from inconsistent local optima across clients and degrades model generalization. The authors propose FedInit, a personalized relaxed initialization strategy: at the start of each local training stage, each client's local state is initialized by moving away from the current global state, in the direction opposite to that client's latest local state. Through an excess risk analysis, the study shows that this local inconsistency mainly affects the generalization error bound, while the optimization error is not sensitive to it. Experiments indicate that FedInit achieves results comparable to several advanced benchmarks without additional training or communication costs, and that the relaxed initialization can be incorporated into existing federated algorithms to further improve their generalization performance.
📝 Abstract
Federated learning (FL) is a distributed paradigm that coordinates a massive number of local clients to collaboratively train a global model via stage-wise local training on heterogeneous datasets. Previous works have implicitly observed that FL suffers from the "client drift" problem, which is caused by inconsistent optima across local clients. However, a solid theoretical analysis explaining the impact of this local inconsistency is still lacking. To alleviate the negative impact of client drift and explore its substance in FL, we first propose FedInit, an efficient FL algorithm that employs a personalized relaxed initialization state at the beginning of each local training stage. Specifically, FedInit initializes the local state by moving away from the current global state in the direction opposite to the latest local state. Moreover, to further understand how inconsistency disrupts performance in FL, we introduce an excess risk analysis and study the divergence term to investigate the test error in FL. Our studies show that the optimization error is not sensitive to this local inconsistency, which instead mainly affects the generalization error bound. Extensive experiments are conducted to validate the efficiency of FedInit. The proposed method achieves results comparable to several advanced benchmarks without any additional training or communication costs. Meanwhile, the stage-wise personalized relaxed initialization can also be incorporated into several current advanced algorithms to achieve higher generalization performance in the FL paradigm.
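The relaxed initialization described in the abstract can be illustrated with a minimal sketch. It is one reading of the sentence "moving away from the current global state in the direction opposite to the latest local state", not the authors' reference implementation. The function name `relaxed_init`, the relaxation coefficient `beta`, and the flat list-of-floats representation of model parameters are all assumptions introduced here for illustration.

```python
from typing import List

Vector = List[float]


def relaxed_init(global_state: Vector, last_local_state: Vector, beta: float) -> Vector:
    """Hypothetical relaxed initialization for one client.

    Starts from the current global state and steps away from the client's
    latest local state: global + beta * (global - last_local).
    `beta` is an assumed relaxation coefficient; beta = 0 recovers the
    usual "start from the global model" initialization.
    """
    if len(global_state) != len(last_local_state):
        raise ValueError("global and local states must have the same length")
    return [g + beta * (g - l) for g, l in zip(global_state, last_local_state)]


# Toy example with a two-parameter model (values chosen arbitrarily).
global_state = [1.0, 2.0]
last_local_state = [0.5, 3.0]

# Each coordinate moves away from where this client ended its last stage.
start = relaxed_init(global_state, last_local_state, beta=0.1)

# With beta = 0 the client simply starts from the global state.
plain_start = relaxed_init(global_state, last_local_state, beta=0.0)
```

Under these assumptions the step needs only the global state the client already receives and the local state it already holds, which is consistent with the abstract's claim of no additional training or communication cost.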