🤖 AI Summary
To address severe global model bias in federated learning (FL) under non-IID and long-tailed data distributions, this paper proposes FedSM, a client-centric framework. Methodologically, FedSM leverages a pre-trained vision-language model to quantify semantic relatedness among classes, guiding semantic alignment between local features and global prototypes to generate class-consistent pseudo-features; introduces a probabilistic class selection strategy to enhance pseudo-sample diversity; and incorporates lightweight local classifier retraining. The approach effectively mitigates domain shift and long-tail bias. Empirically, FedSM achieves state-of-the-art accuracy across multiple long-tailed FL benchmarks while maintaining low communication overhead, high computational efficiency, and strong robustness to data heterogeneity and label imbalance.
📝 Abstract
Federated Learning (FL) enables collaborative model training across decentralized clients without sharing private data. However, FL suffers from biased global models due to non-IID and long-tailed data distributions. We propose **FedSM**, a novel client-centric framework that mitigates this bias through semantics-guided feature mixup and lightweight classifier retraining. FedSM uses a pretrained image-text-aligned model to compute category-level semantic relevance, which guides the selection of categories whose local features are mixed with global prototypes to generate class-consistent pseudo-features. These pseudo-features correct classifier bias, especially when data are heavily skewed. To address potential domain shift between the pretrained model and the client data, we propose probabilistic category selection, which enhances feature diversity and further mitigates bias. All computations are performed locally, requiring minimal server overhead. Extensive experiments on long-tailed datasets with various imbalance levels demonstrate that FedSM consistently outperforms state-of-the-art methods in accuracy while remaining robust to domain shift and computationally efficient.
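The core client-side mechanism described above, sampling mixing partners in proportion to semantic relevance and blending local features with global prototypes, might be sketched as follows. This is a minimal illustration under assumed conventions: the function names, the cosine-similarity relevance measure, the softmax sampling weights, and the linear mixing rule with coefficient `lam` are all assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def semantic_relevance(text_emb):
    """Cosine similarity between class text embeddings (hypothetical
    stand-in for embeddings from a pretrained image-text model)."""
    normed = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    return normed @ normed.T

def mixup_pseudo_features(local_feat, local_cls, prototypes, relevance,
                          lam=0.7, n_samples=4):
    """Sample target classes with probability proportional to their
    semantic relevance to `local_cls` (probabilistic category selection),
    then mix the local feature with each sampled class's global prototype
    to produce pseudo-features labeled with the target classes."""
    weights = np.exp(relevance[local_cls])  # softmax-style positive weights
    weights[local_cls] = 0.0                # exclude the feature's own class
    probs = weights / weights.sum()
    targets = rng.choice(len(prototypes), size=n_samples, p=probs)
    pseudo = lam * prototypes[targets] + (1 - lam) * local_feat
    return pseudo, targets

# Toy example: 5 classes, 16-dim features, random stand-in embeddings.
text_emb = rng.normal(size=(5, 16))
prototypes = rng.normal(size=(5, 16))
rel = semantic_relevance(text_emb)
pseudo, labels = mixup_pseudo_features(rng.normal(size=16), 2,
                                       prototypes, rel)
```

The resulting `pseudo` array (one pseudo-feature per sampled target class) would then be used to retrain the local classifier head, which is the lightweight retraining step the abstract refers to.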