🤖 AI Summary
In large-scale federated learning, partial client participation exacerbates data heterogeneity—such as label and quantity skew—leading to degraded model performance. To address this, we propose KDIA, a framework synergizing knowledge distillation with imbalanced aggregation. Its core contributions are: (1) a weighted teacher-model aggregation mechanism that incorporates client participation frequency, participation count, and local dataset size; (2) a server-side generator that synthesizes near-IID features to facilitate robust teacher–student knowledge transfer; and (3) a unified training objective integrating knowledge distillation, self-distillation, and GAN-based feature generation to enhance generalization under heterogeneity. Extensive experiments on CIFAR-10, CIFAR-100, and CINIC-10 demonstrate that KDIA achieves significantly higher accuracy than baselines under low participation rates and strong data heterogeneity while requiring fewer communication rounds. Notably, the performance gains grow with the degree of heterogeneity, confirming KDIA's effectiveness in challenging real-world FL settings.
📝 Abstract
Federated learning aims to train a global model in a distributed environment whose performance approaches that of centralized training. However, client label skew, data quantity skew, and other forms of heterogeneity severely degrade the model's performance. Most existing methods overlook the scenario in which only a small fraction of clients participates in training within a large-scale client setting, yet our experiments show that this scenario poses a more challenging federated learning task. We therefore propose a Knowledge Distillation with teacher-student Inequitable Aggregation (KDIA) strategy tailored to this setting, which can effectively leverage knowledge from all clients. In KDIA, the student model is the plain average of the participating clients' models, while the teacher model is a weighted aggregation of all clients' models based on three factors: participation intervals, participation counts, and data volume proportions. During local training, self-knowledge distillation is performed. Additionally, a generator trained on the server produces approximately independent and identically distributed (IID) data features locally for auxiliary training. We conduct extensive experiments on the CIFAR-10, CIFAR-100, and CINIC-10 datasets under various heterogeneous settings to evaluate KDIA. The results show that KDIA achieves better accuracy with fewer rounds of training, and the improvement is more significant under severe heterogeneity.
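The teacher-side aggregation described above can be sketched in a few lines. This is a minimal illustration, not the paper's exact rule: the function names and the way the three factors (participation interval, participation count, data-volume proportion) are combined — here, a product of per-factor normalized weights — are assumptions for illustration; the paper defines its own weighting.

```python
import numpy as np

def aggregate_teacher(params, intervals, counts, data_sizes):
    """Illustrative weighted aggregation of ALL client models into a teacher.

    params:     list of per-client parameter vectors (np.ndarray)
    intervals:  rounds since each client last participated (larger = staler)
    counts:     number of rounds each client has participated in
    data_sizes: number of local samples per client

    NOTE: combining the three factors as a product of normalized weights
    is an assumption made for this sketch, not the paper's formula.
    """
    intervals = np.asarray(intervals, dtype=float)
    counts = np.asarray(counts, dtype=float)
    data_sizes = np.asarray(data_sizes, dtype=float)

    # Turn each factor into a distribution over clients
    # (staler clients get smaller interval weight).
    f_interval = 1.0 / (1.0 + intervals)
    f_interval /= f_interval.sum()
    f_count = counts / counts.sum()
    f_data = data_sizes / data_sizes.sum()

    # Combine and renormalize.
    w = f_interval * f_count * f_data
    w /= w.sum()

    # Teacher = weighted average of all client parameter vectors.
    return sum(wi * p for wi, p in zip(w, params))
```

When all three factors are identical across clients, the weights collapse to a uniform average, which matches the student-side plain averaging of participating clients.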