FedProtoKD: Dual Knowledge Distillation with Adaptive Class-wise Prototype Margin for Heterogeneous Federated Learning

📅 2025-08-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
In heterogeneous federated learning (HFL), prototype-based methods suffer from prototype margin shrinkage and suboptimal global knowledge because prototypes are aggregated on the server by weighted averaging. To address statistical heterogeneity while preserving privacy, this paper proposes FedProtoKD, a prototype enhancement framework. The method introduces: (1) a dual knowledge distillation mechanism that jointly transfers clients' logits and prototype feature representations; (2) a contrastive learning-based trainable server prototype with a learnable class-wise adaptive margin to counter margin shrinkage under non-IID data and model heterogeneity; and (3) an importance weighting for public samples based on how close each sample's representation lies to its class-representative prototype. Extensive experiments across diverse heterogeneous settings demonstrate consistent improvements, with average accuracy gains of 1.13% to 34.13% over state-of-the-art HFL baselines, validating the framework's effectiveness and robustness.
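
As a concrete illustration of mechanism (1), here is a minimal PyTorch-style sketch of a dual knowledge distillation loss that combines temperature-scaled logit distillation with prototype-level feature alignment. The temperature `T`, the weights `alpha` and `beta`, and the MSE alignment term are assumptions made for illustration, not the paper's exact formulation.

```python
import torch.nn.functional as F

def dual_kd_loss(student_logits, teacher_logits, features, class_prototypes,
                 labels, T=2.0, alpha=0.5, beta=0.5):
    """Logit-level KD plus prototype-level feature alignment (sketch)."""
    # Logit distillation: soften both distributions with temperature T;
    # the T*T factor keeps gradient magnitudes comparable across temperatures.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_student = F.log_softmax(student_logits / T, dim=1)
    kd_logits = F.kl_div(log_student, soft_teacher,
                         reduction="batchmean") * (T * T)

    # Prototype distillation: pull each sample's feature toward the global
    # prototype of its ground-truth class.
    target_protos = class_prototypes[labels]  # (batch, feat_dim)
    kd_protos = F.mse_loss(features, target_protos)

    return alpha * kd_logits + beta * kd_protos
```

In this reading, a client would add this term to its local task loss; `alpha` and `beta` trade off the two distillation signals.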

📝 Abstract
Heterogeneous Federated Learning (HFL) has gained attention for its ability to accommodate diverse models and heterogeneous data across clients. Prototype-based HFL methods have emerged as a promising way to address statistical heterogeneity and privacy challenges, paving the way for new advances in HFL research. These methods share only class-representative prototypes among heterogeneous clients. However, the prototypes are typically aggregated on the server by weighted averaging, which yields sub-optimal global knowledge and causes the aggregated prototypes to shrink, degrading model performance when models are heterogeneous and data distributions are extremely non-IID. We propose FedProtoKD, a method for the heterogeneous federated learning setting that uses an enhanced dual-knowledge distillation mechanism over clients' logits and prototype feature representations to improve system performance. We address the prototype margin-shrinking problem with a contrastive learning-based trainable server prototype that leverages a class-wise adaptive prototype margin. Furthermore, we assess the importance of public samples by the closeness of each sample's prototype to its class-representative prototype, which further enhances learning performance. FedProtoKD achieves average accuracy improvements of 1.13% to 34.13% across various settings and significantly outperforms existing state-of-the-art HFL methods.
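
To make the sample-importance idea from the abstract concrete, below is a hedged sketch that weights each public sample by how close its feature representation lies to its class-representative prototype. The cosine-similarity measure, the temperature `tau`, and the sigmoid mapping are illustrative assumptions, not the paper's stated formula.

```python
import torch
import torch.nn.functional as F

def sample_importance(features, class_prototypes, labels, tau=1.0):
    """Per-sample weights in (0, 1): features closer to their class
    prototype get larger weights during distillation (sketch)."""
    protos = F.normalize(class_prototypes[labels], dim=1)  # (batch, feat_dim)
    feats = F.normalize(features, dim=1)
    closeness = (feats * protos).sum(dim=1)  # cosine similarity per sample
    return torch.sigmoid(closeness / tau)    # map closeness to a (0, 1) weight
```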
Problem

Research questions and friction points this paper is trying to address.

Addresses prototype margin-shrinking in heterogeneous federated learning
Improves global knowledge aggregation with dual-knowledge distillation
Enhances performance in non-IID data distributions using adaptive margins
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual knowledge distillation with logits and prototypes
Adaptive class-wise prototype margin for contrastive learning (see the sketch after this list)
Public sample importance assessment via prototype closeness
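
The sketch below illustrates the second innovation item: trainable server prototypes refined with a contrastive objective and a learnable class-wise margin that keeps aggregated prototypes from collapsing together. The softplus margin parameterization and the cross-entropy-style contrastive form are assumptions, not the authors' exact objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ServerPrototypes(nn.Module):
    """Trainable global prototypes with a learnable per-class margin (sketch)."""

    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.protos = nn.Parameter(torch.randn(num_classes, feat_dim))
        # One learnable margin per class; softplus keeps it positive.
        self.raw_margin = nn.Parameter(torch.zeros(num_classes))

    def contrastive_loss(self, client_protos, labels):
        """client_protos: (N, feat_dim) prototypes uploaded by clients;
        labels: (N,) their class ids."""
        p = F.normalize(self.protos, dim=1)
        x = F.normalize(client_protos, dim=1)
        sim = x @ p.t()  # (N, num_classes) cosine similarities
        margin = F.softplus(self.raw_margin)[labels]  # (N,)
        # Subtracting the class-wise margin from the positive logit forces
        # each class to win by at least its margin, enlarging separation.
        logits = sim.clone()
        logits[torch.arange(len(labels)), labels] -= margin
        return F.cross_entropy(logits, labels)
```

Training the prototypes and margins against this loss pulls same-class client prototypes together and pushes different-class prototypes apart, the server-side counterpart to the margin-shrinking problem the paper targets.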
Md Anwar Hossen
Graduate Student, CS Department, Iowa State University
Federated Learning | Heterogeneous Systems | Foundation Models
Fatema Siddika
Department of Computer Science, Iowa State University, Ames, USA
Wensheng Zhang
Department of Computer Science, Iowa State University, Ames, USA
Anuj Sharma
Department of Computer Science, Iowa State University, Ames, USA
Ali Jannesari
Associate Professor, Iowa State University
high-performance computing | machine learning | parallel computing | software analytics