Tackling Data Heterogeneity in Federated Learning through Knowledge Distillation with Inequitable Aggregation

📅 2025-06-25
📈 Citations: 0
Influential: 0
🤖 AI Summary
In large-scale federated learning, partial client participation exacerbates the effects of data heterogeneity (such as label and quantity skew), degrading model performance. To address this, we propose KDIA, a framework combining knowledge distillation with inequitable aggregation. Its core contributions are: (1) a weighted teacher-model aggregation mechanism that incorporates each client's participation interval, participation count, and local dataset size; (2) a server-side generator that synthesizes approximately IID features to support robust teacher-student knowledge transfer; and (3) a unified local training objective integrating knowledge distillation, self-distillation, and GAN-based feature generation to improve generalization under heterogeneity. Extensive experiments on CIFAR-10, CIFAR-100, and CINIC-10 demonstrate that KDIA achieves significantly higher accuracy than baselines under low participation rates and strong data heterogeneity while requiring fewer communication rounds. Notably, the gains grow with the degree of heterogeneity, confirming KDIA's effectiveness in challenging real-world FL settings.
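Contribution (2) can be caricatured as a conditional sampler: class labels are drawn uniformly (hence approximately IID across classes) and mapped, together with noise, to a feature vector. The sketch below uses a hypothetical linear generator with parameters `W`, `b`; the paper trains a GAN-based network, so this only illustrates the sampling interface, not the actual model.

```python
import numpy as np

def sample_near_iid_features(W, b, n_classes, n_samples, rng=None):
    """Toy stand-in for a server-side feature generator: draw class labels
    uniformly (so the label mix is approximately IID) and push
    (noise, one-hot label) through a linear map W, b. The real KDIA
    generator is a trained network; W and b here are hypothetical."""
    rng = np.random.default_rng(rng)
    labels = rng.integers(0, n_classes, size=n_samples)   # uniform labels -> near-IID mix
    onehot = np.eye(n_classes)[labels]
    noise = rng.standard_normal((n_samples, W.shape[0] - n_classes))
    z = np.concatenate([noise, onehot], axis=1)           # condition on the class label
    feats = z @ W + b                                     # linear "generator" for illustration
    return feats, labels
```

Clients would mix such synthesized features into local training to counteract their skewed label distributions.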

📝 Abstract
Federated learning aims to train, in a distributed environment, a global model whose performance approaches that of centralized training. However, issues such as client label skew, data quantity skew, and other forms of heterogeneity severely degrade model performance. Most existing methods overlook the scenario where only a small portion of clients participate in training within a large-scale client setting, whereas our experiments show that this scenario presents a more challenging federated learning task. We therefore propose a Knowledge Distillation with teacher-student Inequitable Aggregation (KDIA) strategy tailored to this setting, which can effectively leverage knowledge from all clients. In KDIA, the student model is the average aggregation of the participating clients, while the teacher model is formed by a weighted aggregation of all clients based on three frequencies: participation intervals, participation counts, and data-volume proportions. During local training, self-knowledge distillation is performed. Additionally, we use a generator trained on the server to produce approximately independent and identically distributed (IID) data features locally for auxiliary training. We conduct extensive experiments on the CIFAR-10/100/CINIC-10 datasets under various heterogeneous settings to evaluate KDIA. The results show that KDIA achieves better accuracy with fewer rounds of training, and the improvement is more significant under severe heterogeneity.
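The teacher-student aggregation described in the abstract can be sketched as follows. This is an illustrative reading, not the paper's exact formula: the student averages only this round's participants, while the teacher weights all clients by three normalized frequencies (participation interval, participation count, data-volume proportion), mixed equally here as an assumption.

```python
import numpy as np

def aggregate_teacher_student(client_weights, participating,
                              intervals, counts, data_sizes):
    """Sketch of KDIA-style inequitable aggregation (illustrative only).
    client_weights: (n_clients, n_params) flattened local models.
    participating:  indices of this round's participants.
    intervals:      rounds since each client last participated.
    counts:         how many rounds each client has participated in.
    data_sizes:     local dataset size per client."""
    client_weights = np.asarray(client_weights, dtype=float)

    # Student: plain average over the participating clients.
    student = client_weights[participating].mean(axis=0)

    def norm(x):
        x = np.asarray(x, dtype=float)
        return x / x.sum()

    # Teacher: convex combination over ALL clients; recent participants,
    # frequent participants, and data-rich clients weigh more.
    # Equal 1/3 mixing of the three frequencies is an assumption.
    w = (norm(1.0 / (1.0 + np.asarray(intervals, dtype=float)))
         + norm(counts)
         + norm(data_sizes)) / 3.0
    teacher = (w[:, None] * client_weights).sum(axis=0)
    return student, teacher
```

Because the combined weights sum to one, the teacher stays a convex combination of all client models even when most clients sat out the current round.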
Problem

Research questions and friction points this paper is trying to address.

Addressing data heterogeneity in federated learning
Improving model performance with limited client participation
Enhancing accuracy through knowledge distillation and inequitable aggregation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Knowledge distillation combined with inequitable teacher-student aggregation
Teacher model weighted by three frequencies: participation intervals, participation counts, and data-volume proportions
Server-trained generator producing approximately IID features for local auxiliary training
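These pieces meet in the local objective: each client fits the true labels while distilling the weighted teacher into the student. A minimal numpy sketch, assuming a standard temperature-scaled KL distillation term and hypothetical hyperparameters `T` and `alpha` (the paper's exact loss and coefficients may differ):

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T
    z = z - z.max(axis=-1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kdia_local_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Illustrative local objective: cross-entropy on true labels plus a
    temperature-scaled KL term distilling the teacher into the student.
    alpha and T are assumed hyperparameters, not taken from the paper."""
    p_s = softmax(student_logits)
    ce = -np.mean(np.log(p_s[np.arange(len(labels)), labels] + 1e-12))
    p_t = softmax(teacher_logits, T)
    log_ratio = np.log((p_t + 1e-12) / (softmax(student_logits, T) + 1e-12))
    kd = (T ** 2) * np.mean(np.sum(p_t * log_ratio, axis=-1))  # KL(teacher || student)
    return (1 - alpha) * ce + alpha * kd
```

The KL term vanishes when student and teacher agree, so the loss gracefully reduces to plain cross-entropy as local training converges toward the teacher.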