Partial Knowledge Distillation for Alleviating the Inherent Inter-Class Discrepancy in Federated Learning

📅 2024-11-23

🏛️ arXiv.org

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

In federated learning, an inherent weak-class phenomenon arises from the data itself: even when the class distribution is balanced both globally and locally, certain classes are persistently misclassified, and the resulting inter-class accuracy discrepancy can exceed 36.9% on FashionMNIST and CIFAR-10. This work identifies the phenomenon and empirically analyzes its potential cause. It proposes Partial Knowledge Distillation (PKD), a misclassification-driven knowledge transfer mechanism that activates *only* when specific misclassifications occur within certain weak classes, so distillation is selective and on demand rather than applied uniformly to every sample. In the reported experiments, weak-class accuracy improves by 10.7%, which reduces the inherent inter-class discrepancy.

📝 Abstract

Substantial efforts have been devoted to alleviating the impact of the long-tailed class distribution in federated learning. In this work, we observe an interesting phenomenon: certain weak classes consistently exist even in class-balanced learning. These weak classes, unlike the minority classes studied in previous work, are inherent to the data and remain fairly consistent across network structures, learning paradigms, and data partitioning methods. The inherent inter-class accuracy discrepancy can exceed 36.9% for federated learning on the FashionMNIST and CIFAR-10 datasets, even when the class distribution is balanced both globally and locally. In this study, we empirically analyze the potential reason for this phenomenon. Furthermore, we propose a partial knowledge distillation (PKD) method to improve the model's classification accuracy for weak classes. In this approach, knowledge transfer is initiated when specific misclassifications occur within certain weak classes. Experimental results show that the accuracy of weak classes can be improved by 10.7%, effectively reducing the inherent inter-class discrepancy.

Problem

Research questions and friction points this paper is trying to address.

Addresses inherent inter-class accuracy discrepancy in federated learning

Proposes partial knowledge distillation to improve weak class accuracy

Improves accuracy on consistently weak classes by 10.7%, narrowing the inter-class accuracy gap
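
To make the notion of inter-class accuracy discrepancy concrete, here is a minimal sketch of how it could be measured from predictions. It assumes the discrepancy is the gap between the best and worst per-class accuracy, with the standard deviation as a second spread measure; the summary above does not spell out the exact definition, and the function names and toy labels are illustrative only.

```python
import numpy as np

def per_class_accuracy(y_true, y_pred, num_classes):
    """Return an array with the accuracy of each class (NaN if a class is absent)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    acc = np.full(num_classes, np.nan)
    for c in range(num_classes):
        mask = y_true == c
        if mask.any():
            acc[c] = np.mean(y_pred[mask] == c)
    return acc

def inter_class_discrepancy(acc):
    """Assumed definition: (best minus worst class accuracy, std across classes)."""
    gap = float(np.nanmax(acc) - np.nanmin(acc))
    spread = float(np.nanstd(acc))
    return gap, spread

# Toy, class-balanced example (4 samples per class); labels are made up.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
y_pred = [0, 0, 0, 0, 1, 1, 1, 2, 2, 1, 1, 2]

acc = per_class_accuracy(y_true, y_pred, num_classes=3)
gap, spread = inter_class_discrepancy(acc)
weak_classes = np.flatnonzero(acc < np.nanmean(acc))  # below-average classes
```

In this toy case class 2 is the weak class even though every class has the same number of samples, which mirrors the class-balanced setting described in the abstract.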

Innovation

Methods, ideas, or system contributions that make the work stand out.

Partial knowledge distillation for weak classes

Triggered by specific misclassifications in weak classes

Improves weak-class accuracy by 10.7%, reducing the inherent inter-class discrepancy
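
The points above can be sketched as a loss function in which the distillation term is switched on only for selected samples. This is a hedged illustration, not the paper's implementation: it assumes a teacher model's logits are available, that the trigger is "true label is a weak class and the student's prediction is wrong", and that the distillation term is a temperature-scaled KL divergence. The names `partial_kd_loss`, `weak_classes`, `temperature`, and `alpha` are hypothetical.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax with temperature, shifted for numerical stability."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def partial_kd_loss(student_logits, teacher_logits, labels, weak_classes,
                    temperature=2.0, alpha=0.5):
    """Cross-entropy on every sample plus KL distillation on triggered samples only."""
    student_logits = np.asarray(student_logits, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    eps = 1e-12

    # Standard cross-entropy against the ground-truth labels.
    p_student = softmax(student_logits)
    ce = -np.log(p_student[np.arange(n), labels] + eps)

    # Assumed trigger: weak-class sample that the student misclassifies.
    is_weak = np.isin(labels, list(weak_classes))
    is_wrong = student_logits.argmax(axis=1) != labels
    trigger = is_weak & is_wrong

    # Temperature-scaled KL(teacher || student), one value per sample.
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + eps) - np.log(p_s + eps)), axis=1)

    per_sample = ce + alpha * temperature ** 2 * trigger * kl
    return float(per_sample.mean()), trigger

# Toy batch with made-up logits; classes 1 and 2 are treated as weak.
student = [[2, 0, 0], [0, 2, 0], [0, 0, 2], [2, 0, 0]]
teacher = [[2, 0, 0], [0, 2, 0], [0, 2, 0], [0, 0, 2]]
labels = [0, 1, 1, 2]

loss_pkd, trigger = partial_kd_loss(student, teacher, labels, weak_classes={1, 2})
loss_ce, _ = partial_kd_loss(student, teacher, labels, weak_classes={1, 2}, alpha=0.0)
```

Only the last two samples contribute a distillation term here; correctly classified samples and samples from non-weak classes are trained with plain cross-entropy, which is what distinguishes this from uniform distillation over the whole batch.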

🔎 Similar Papers

Adaptive Self-Distillation for Minimizing Client Drift in Heterogeneous Federated Learning