🤖 AI Summary
To address performance degradation of the global model in federated learning caused by non-IID client data distributions, this paper proposes a distribution-aware client selection mechanism. It dynamically evaluates each client's label distribution against a target distribution, either the balanced (uniform) distribution or the federation's combined label distribution, and prioritizes clients with higher alignment for participation in each training round. The method introduces a switchable target-distribution mode to adaptively handle both local and global imbalance, along with a lightweight distribution-statistics module that integrates seamlessly with mainstream algorithms such as FedAvg, FedProx, and SCAFFOLD. Experiments on CIFAR-10 and Fashion-MNIST demonstrate that the proposed approach improves convergence speed and final model accuracy: aligning with the balanced distribution performs best under local imbalance, whereas aligning with the combined distribution is more effective under global imbalance.
📝 Abstract
Federated learning (FL) is a distributed learning paradigm that allows multiple clients to jointly train a shared model while maintaining data privacy. Despite its great potential for domains with strict data privacy requirements, data imbalance among clients is a threat to the success of FL, as it degrades the performance of the shared model. To address this, various studies have proposed enhancements to existing FL strategies, particularly client selection methods that mitigate the detrimental effects of data imbalance. In this paper, we propose an extension to existing FL strategies that selects the active clients which best align the current label distribution with one of two target distributions, namely a balanced distribution or the federation's combined label distribution. We then empirically verify the improvements achieved by our distribution-controlled client selection on three common FL strategies and two datasets. Our results show that while aligning the label distribution with a balanced distribution yields the greatest improvements under local imbalance, alignment with the federation's combined label distribution is superior under global imbalance.
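The selection idea described in the abstract can be illustrated as a greedy procedure: at each round, clients are added one at a time so that the label distribution of the selected cohort moves closest to the chosen target (uniform, or the federation's combined label distribution). The sketch below is an assumption-laden illustration, not the paper's exact algorithm; the function name, the L2 distance metric, and the greedy strategy are all choices made for this example.

```python
import numpy as np

def select_clients(client_label_counts, num_select, target="balanced"):
    """Greedily pick clients whose combined label distribution best
    matches a target distribution.

    client_label_counts: dict mapping client id -> per-label sample counts
    target: "balanced" (uniform) or "joint" (federation's combined labels)

    Illustrative sketch only; the paper's actual selection rule and
    distance measure may differ.
    """
    ids = list(client_label_counts)
    counts = {c: np.asarray(client_label_counts[c], dtype=float) for c in ids}
    num_labels = len(next(iter(counts.values())))

    if target == "balanced":
        # Uniform target: equal mass on every label.
        target_dist = np.full(num_labels, 1.0 / num_labels)
    else:
        # Joint target: the federation's combined label distribution.
        total = sum(counts.values())
        target_dist = total / total.sum()

    selected, combined = [], np.zeros(num_labels)
    for _ in range(num_select):
        # Add the client that most reduces the L2 distance between the
        # cohort's normalized label histogram and the target.
        best = min(
            (c for c in ids if c not in selected),
            key=lambda c: np.linalg.norm(
                (combined + counts[c]) / (combined + counts[c]).sum()
                - target_dist
            ),
        )
        selected.append(best)
        combined += counts[best]
    return selected
```

For example, with three clients holding labels [10, 0], [0, 10], and [10, 0], selecting two clients against the balanced target pairs the two complementary clients, since together they yield a uniform label histogram.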