Stabilizing Federated Learning under Extreme Heterogeneity with HeteRo-Select

📅 2025-08-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

To address training instability and significant accuracy degradation in federated learning under highly heterogeneous data (e.g., label skew α = 0.1), this paper proposes HeteRo-Select, a client selection framework that jointly scores clients on utility, fairness, update velocity, and data diversity. It combines high-loss-priority sampling with diversity-enhancing strategies to mitigate selection bias and improve generalization. The paper establishes convergence guarantees under strong regularization. On CIFAR-10, HeteRo-Select reaches a peak accuracy of 74.75% and a final accuracy of 72.76%, a stability drop of only 1.99 percentage points, compared with 73.98%, 71.25%, and 2.73 points for the Oort baseline. The framework is also reported to remain communication-efficient while sustaining long-term performance in heterogeneous settings.

📝 Abstract

Federated Learning (FL) is a machine learning technique that often suffers from training instability due to the diverse nature of client data. Although utility-based client selection methods like Oort accelerate convergence by prioritizing high-loss clients, they frequently experience significant drops in accuracy during later stages of training. We propose HeteRo-Select, a theoretically grounded framework designed to maintain high performance and ensure long-term training stability. We provide a theoretical analysis showing that when client data is highly heterogeneous, selecting a well-chosen subset of clients can reduce communication more effectively than full participation. HeteRo-Select uses a transparent, step-by-step scoring system that considers client usefulness, fairness, update speed, and data variety. We also establish convergence guarantees under strong regularization. Our experimental results on the CIFAR-10 dataset under significant label skew ($\alpha=0.1$) support the theoretical findings. HeteRo-Select outperforms existing approaches in peak accuracy, final accuracy, and training stability. Specifically, HeteRo-Select achieves a peak accuracy of $74.75\%$, a final accuracy of $72.76\%$, and a minimal stability drop of $1.99\%$. In contrast, Oort records a lower peak accuracy of $73.98\%$, a final accuracy of $71.25\%$, and a larger stability drop of $2.73\%$. The theoretical foundations and empirical performance in our study make HeteRo-Select a reliable solution for real-world heterogeneous FL problems.
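
The reported stability drops are consistent with reading "stability drop" as peak accuracy minus final accuracy, in percentage points. The short Python check below reproduces both figures from the accuracies quoted in the abstract; the function name `stability_drop` and the dictionary layout are illustrative choices, and the definition itself is an assumption inferred from the numbers rather than one stated in this summary.

```python
# Accuracies (percent) quoted in the abstract for CIFAR-10 at label skew alpha = 0.1.
reported = {
    "HeteRo-Select": {"peak": 74.75, "final": 72.76},
    "Oort": {"peak": 73.98, "final": 71.25},
}


def stability_drop(peak: float, final: float) -> float:
    """Assumed definition: peak accuracy minus final accuracy, in percentage points."""
    return round(peak - final, 2)


# Compute the drop for each method and print a small comparison.
drops = {name: stability_drop(**acc) for name, acc in reported.items()}
for name, drop in drops.items():
    print(f"{name}: stability drop = {drop:.2f} points")
```

Under this reading, a smaller drop means the model holds on to more of its best accuracy by the end of training.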

Problem

Research questions and friction points this paper is trying to address.

Addresses training instability in Federated Learning due to data heterogeneity

Proposes HeteRo-Select for high performance and long-term stability

Improves accuracy and reduces communication in highly heterogeneous settings
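
To give a feel for what label skew at α = 0.1 can look like, here is a minimal sketch that splits sample indices across clients class by class. It assumes α is the concentration parameter of a Dirichlet distribution over clients, which is a common convention in FL benchmarks but is not stated in this summary. The function names, the synthetic labels, and the client count are hypothetical.

```python
import random


def dirichlet_sample(alpha: float, k: int, rng: random.Random) -> list:
    """Draw one sample from a symmetric Dirichlet(alpha) via normalized Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(k)]
    total = sum(draws)
    if total == 0.0:  # guard against numerical underflow for very small alpha
        return [1.0 / k] * k
    return [d / total for d in draws]


def partition_by_label_skew(labels, num_clients, alpha, seed=0):
    """Assign sample indices to clients, one class at a time, with Dirichlet shares."""
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        shares = dirichlet_sample(alpha, num_clients, rng)
        start, cumulative = 0, 0.0
        for c, share in enumerate(shares):
            cumulative += share
            # The last client takes the remainder so every index is assigned once.
            end = len(idxs) if c == num_clients - 1 else round(cumulative * len(idxs))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


# Synthetic stand-in for a 10-class dataset (not real CIFAR-10 labels).
labels = [i % 10 for i in range(1000)]
clients = partition_by_label_skew(labels, num_clients=5, alpha=0.1)
for c, idxs in enumerate(clients):
    classes = sorted({labels[i] for i in idxs})
    print(f"client {c}: {len(idxs)} samples, classes {classes}")
```

With a small α, each class tends to concentrate on one or two clients, which is the kind of heterogeneity the paper targets.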

Innovation

Methods, ideas, or system contributions that make the work stand out.

HeteRo-Select framework ensures stable FL training

Smart client subset selection reduces communication costs

Step-by-step scoring system improves accuracy and fairness
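
The scoring idea can be illustrated with a small sketch that combines the four factors named in the abstract into one score per client. This is not the paper's formula, which this summary does not give: the additive form, the weights, the proxy chosen for each factor, and every name (`ClientStats`, `score`, `select_clients`) are assumptions made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class ClientStats:
    client_id: int
    loss: float            # recent local loss (utility proxy, assumed)
    times_selected: int    # participation count so far (fairness proxy, assumed)
    loss_change: float     # loss reduction since last round (update-speed proxy, assumed)
    label_counts: list     # per-class sample counts (diversity proxy, assumed)


def label_entropy(counts) -> float:
    """Normalized entropy of the label histogram: 0 for one class, 1 for uniform."""
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs) / math.log(len(counts))


def score(c: ClientStats, round_idx: int, weights=(1.0, 0.5, 0.5, 0.5)) -> float:
    """Hypothetical additive score over utility, fairness, speed, and diversity."""
    w_utility, w_fair, w_speed, w_div = weights
    fairness = 1.0 - c.times_selected / max(round_idx, 1)  # favors rarely chosen clients
    speed = max(c.loss_change, 0.0)
    return (w_utility * c.loss + w_fair * fairness
            + w_speed * speed + w_div * label_entropy(c.label_counts))


def select_clients(clients, k: int, round_idx: int):
    """Pick the k highest-scoring clients (deterministic top-k for simplicity)."""
    ranked = sorted(clients, key=lambda c: score(c, round_idx), reverse=True)
    return [c.client_id for c in ranked[:k]]


# Toy statistics for three clients at round 10.
clients = [
    ClientStats(0, loss=2.0, times_selected=0, loss_change=0.1, label_counts=[50, 50]),
    ClientStats(1, loss=0.5, times_selected=10, loss_change=0.0, label_counts=[100, 0]),
    ClientStats(2, loss=1.5, times_selected=5, loss_change=0.2, label_counts=[80, 20]),
]
selected = select_clients(clients, k=2, round_idx=10)
print(selected)
```

In this toy setup the always-selected, low-loss, single-class client ranks last. A pure loss-based ranking would also place it last here, so the example mainly shows how the extra terms enter the score rather than a case where they change the outcome.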

🔎 Similar Papers

FedPeWS: Personalized Warmup via Subnetworks for Enhanced Heterogeneous Federated Learning