🤖 AI Summary
This work addresses the issue of global model drift in federated learning caused by imbalanced client data under non-IID settings. To mitigate this, the authors propose a client-side sample balancing mechanism that leverages on-device generative models for knowledge infusion and resampling, achieving data equilibrium under a fixed sample budget. The approach integrates knowledge alignment and knowledge dropout regularization strategies to enhance model generalization. As the first framework to proactively prevent model drift at the client sample level, it is extensible to heterogeneous clients and compatible with diverse federated algorithms. Extensive experiments across multiple real-world, complex scenarios demonstrate its significant superiority over state-of-the-art methods, confirming its effectiveness and robustness.
📝 Abstract
Federated learning is a collaborative learning paradigm in which clients cooperate by sharing model parameters instead of data. However, in the non-IID setting, the global model suffers from client drift, which can seriously degrade its final performance. Previous methods typically correct a global model that has already drifted, via loss-function or gradient adjustments, overlooking the impact of client samples. In this paper, we rethink the role of the client side and propose Federated Balanced Learning (FBL), which prevents this issue from the outset through sample balancing on the client side. Technically, FBL lets clients with imbalanced data achieve sample balance through knowledge filling and knowledge sampling using edge-side generative models, under the constraint of a fixed number of data samples per client. Furthermore, we design a Knowledge Alignment Strategy to bridge the gap between synthetic and real data, and a Knowledge Drop Strategy to regularize our method. We also scale our method to real, complex scenarios, allowing different clients to adopt different methods, and extend our framework to further improve performance. Extensive experiments show that our method outperforms state-of-the-art baselines. The code will be released upon acceptance.
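As a rough illustration of the client-side balancing idea (not the authors' released code), the sketch below downsamples over-represented classes ("knowledge sampling") and tops up under-represented ones with synthetic samples from a local generative model ("knowledge filling"), all under a fixed sample budget. The `generate_fn` hook is a hypothetical stand-in for the edge-side generative model:

```python
import random
from collections import defaultdict

def balance_client_data(samples, labels, budget, generate_fn, seed=0):
    """Balance a client's class distribution under a fixed sample budget.

    Over-represented classes are randomly downsampled; under-represented
    classes are filled with synthetic samples from `generate_fn(label, n)`,
    a hypothetical hook returning n synthetic samples for `label`.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    classes = sorted(by_class)
    per_class = budget // len(classes)  # equal share of the fixed budget

    balanced_x, balanced_y = [], []
    for y in classes:
        real = by_class[y]
        if len(real) >= per_class:
            keep = rng.sample(real, per_class)                   # knowledge sampling
        else:
            keep = real + generate_fn(y, per_class - len(real))  # knowledge filling
        balanced_x.extend(keep)
        balanced_y.extend([y] * per_class)
    return balanced_x, balanced_y
```

In a full pipeline, each client would run this before local training, so every round of federated averaging aggregates models trained on class-balanced data of identical size.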