🤖 AI Summary
This work addresses the degradation of global model generalization and representation misalignment in federated learning caused by client data heterogeneity, scarcity, and class imbalance. To mitigate these issues, we propose FedQuad, the first method to integrate quadruplet metric learning into federated learning. FedQuad explicitly compresses intra-class representations while enlarging inter-class distances, thereby alleviating the representation misalignment that model aggregation introduces under non-IID data distributions. Combined with a stochastic client selection mechanism and embedding space stability analysis, our approach effectively prevents representation collapse. Extensive experiments demonstrate that FedQuad consistently outperforms existing baselines across various non-IID settings on CIFAR-10, CIFAR-100, and Tiny-ImageNet, improving both model generalization and representation consistency.
📝 Abstract
Federated Learning (FL) enables decentralised model training across distributed clients without requiring data centralisation. However, the generalisation performance of the global model is usually degraded by data heterogeneity across clients, particularly under limited data availability and class imbalance. To address this challenge, we propose FedQuad, a novel method that explicitly minimises intra-class distances between representations while enforcing inter-class separation across clients. By jointly minimising distances between positive pairs and maximising distances between negative pairs, the proposed approach mitigates representation misalignment introduced during model aggregation. We evaluate our method on CIFAR-10, CIFAR-100, and Tiny-ImageNet under diverse non-IID settings and varying numbers of clients, demonstrating consistent improvements over existing baselines. Additionally, we provide a comprehensive analysis of metric learning-based approaches in both centralised and federated environments, highlighting their effectiveness in alleviating representation collapse under heterogeneous data distributions.
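To make the positive-pair and negative-pair objective concrete, the following is a minimal sketch of the generic quadruplet loss from the metric learning literature. It is not FedQuad's exact formulation, which the abstract does not specify. The function names, the squared Euclidean distance, and the margin values `margin1` and `margin2` are assumptions chosen for illustration.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quadruplet_loss(anchor, positive, negative1, negative2,
                    margin1=1.0, margin2=0.5):
    """Generic quadruplet loss (illustrative, not the paper's exact loss).

    anchor and positive share a class; negative1 and negative2 come from
    two other classes that also differ from each other.
    Margins are hypothetical defaults.
    """
    d_ap = sq_dist(anchor, positive)      # positive pair: pull together
    d_an = sq_dist(anchor, negative1)     # negative pair involving the anchor
    d_nn = sq_dist(negative1, negative2)  # negative pair not involving the anchor

    # Triplet-style term: the positive must be closer to the anchor
    # than the negative is, by at least margin1.
    term_anchor = max(0.0, d_ap - d_an + margin1)
    # Extra quadruplet term: intra-class distance must also be smaller
    # than the distance between two unrelated classes, by at least margin2.
    term_global = max(0.0, d_ap - d_nn + margin2)
    return term_anchor + term_global


# Well-separated embeddings incur no loss.
loss_separated = quadruplet_loss([0, 0], [0.1, 0], [2, 0], [0, 2])
# Here the two negatives are as close to each other as the positive pair,
# so only the second term is active.
loss_violating = quadruplet_loss([0, 0], [1, 0], [1, 1], [2, 1])
```

In a federated setting, a term of this kind would typically be added to each client's local classification loss, so that embeddings learned on different clients are pushed towards the same class-wise geometry before aggregation; how FedQuad weights or samples these quadruplets is not stated in the abstract.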