FedProtoKD: Dual Knowledge Distillation with Adaptive Class-wise Prototype Margin for Heterogeneous Federated Learning

📅 2025-08-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
In heterogeneous federated learning (HFL), prototype-based methods suffer from prototype margin shrinkage and suboptimal global knowledge because prototypes are aggregated on the server by weighted averaging. To address statistical heterogeneity while preserving privacy, this paper proposes FedProtoKD, a prototype enhancement framework. The method introduces: (1) a dual knowledge distillation mechanism that jointly transfers clients' logits and prototype feature representations; (2) a contrastive learning-based trainable server prototype with a learnable class-wise adaptive margin to counter margin shrinkage under non-IID data and model heterogeneity; and (3) an importance weighting for public samples based on how close each sample's representation lies to its class-representative prototype. Extensive experiments across diverse heterogeneous settings demonstrate consistent improvements, with average accuracy gains of 1.13% to 34.13% over state-of-the-art HFL baselines, validating the framework's effectiveness and robustness.
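
As a concrete illustration of mechanism (1), here is a minimal PyTorch-style sketch of a dual knowledge distillation loss that combines temperature-scaled logit distillation with prototype-level feature alignment. The temperature `T`, the weights `alpha` and `beta`, and the MSE alignment term are assumptions made for illustration, not the paper's exact formulation.

```python
import torch.nn.functional as F

def dual_kd_loss(student_logits, teacher_logits, features, class_prototypes,
                 labels, T=2.0, alpha=0.5, beta=0.5):
    """Logit-level KD plus prototype-level feature alignment (sketch)."""
    # Logit distillation: soften both distributions with temperature T;
    # the T*T factor keeps gradient magnitudes comparable across temperatures.
    soft_teacher = F.softmax(teacher_logits / T, dim=1)
    log_student = F.log_softmax(student_logits / T, dim=1)
    kd_logits = F.kl_div(log_student, soft_teacher,
                         reduction="batchmean") * (T * T)

    # Prototype distillation: pull each sample's feature toward the global
    # prototype of its ground-truth class.
    target_protos = class_prototypes[labels]  # (batch, feat_dim)
    kd_protos = F.mse_loss(features, target_protos)

    return alpha * kd_logits + beta * kd_protos
```

In this reading, a client would add this term to its local task loss; `alpha` and `beta` trade off the two distillation signals.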

📝 Abstract
Heterogeneous Federated Learning (HFL) has gained attention for its ability to accommodate diverse models and heterogeneous data across clients. Prototype-based HFL methods have emerged as a promising way to address statistical heterogeneity and privacy challenges, paving the way for new advances in HFL research. These methods share only class-representative prototypes among heterogeneous clients. However, the prototypes are typically aggregated on the server by weighted averaging, which yields sub-optimal global knowledge and causes the aggregated prototypes to shrink, degrading model performance when models are heterogeneous and data distributions are extremely non-IID. We propose FedProtoKD, a method for the heterogeneous federated learning setting that uses an enhanced dual-knowledge distillation mechanism over clients' logits and prototype feature representations to improve system performance. We address the prototype margin-shrinking problem with a contrastive learning-based trainable server prototype that leverages a class-wise adaptive prototype margin. Furthermore, we assess the importance of public samples by the closeness of each sample's prototype to its class-representative prototype, which further enhances learning performance. FedProtoKD achieves average accuracy improvements of 1.13% to 34.13% across various settings and significantly outperforms existing state-of-the-art HFL methods.
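
To make the sample-importance idea from the abstract concrete, below is a hedged sketch that weights each public sample by how close its feature representation lies to its class-representative prototype. The cosine-similarity measure, the temperature `tau`, and the sigmoid mapping are illustrative assumptions, not the paper's stated formula.

```python
import torch
import torch.nn.functional as F

def sample_importance(features, class_prototypes, labels, tau=1.0):
    """Per-sample weights in (0, 1): features closer to their class
    prototype get larger weights during distillation (sketch)."""
    protos = F.normalize(class_prototypes[labels], dim=1)  # (batch, feat_dim)
    feats = F.normalize(features, dim=1)
    closeness = (feats * protos).sum(dim=1)  # cosine similarity per sample
    return torch.sigmoid(closeness / tau)    # map closeness to a (0, 1) weight
```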
Problem

Research questions and friction points this paper is trying to address.

Addresses prototype margin-shrinking in heterogeneous federated learning
Improves global knowledge aggregation with dual-knowledge distillation
Enhances performance in non-IID data distributions using adaptive margins
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dual knowledge distillation with logits and prototypes
Adaptive class-wise prototype margin for contrastive learning (see the sketch after this list)
Public sample importance assessment via prototype closeness
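
The sketch below illustrates the second innovation item: trainable server prototypes refined with a contrastive objective and a learnable class-wise margin that keeps aggregated prototypes from collapsing together. The softplus margin parameterization and the cross-entropy-style contrastive form are assumptions, not the authors' exact objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ServerPrototypes(nn.Module):
    """Trainable global prototypes with a learnable per-class margin (sketch)."""

    def __init__(self, num_classes, feat_dim):
        super().__init__()
        self.protos = nn.Parameter(torch.randn(num_classes, feat_dim))
        # One learnable margin per class; softplus keeps it positive.
        self.raw_margin = nn.Parameter(torch.zeros(num_classes))

    def contrastive_loss(self, client_protos, labels):
        """client_protos: (N, feat_dim) prototypes uploaded by clients;
        labels: (N,) their class ids."""
        p = F.normalize(self.protos, dim=1)
        x = F.normalize(client_protos, dim=1)
        sim = x @ p.t()  # (N, num_classes) cosine similarities
        margin = F.softplus(self.raw_margin)[labels]  # (N,)
        # Subtracting the class-wise margin from the positive logit forces
        # each class to win by at least its margin, enlarging separation.
        logits = sim.clone()
        logits[torch.arange(len(labels)), labels] -= margin
        return F.cross_entropy(logits, labels)
```

Training the prototypes and margins against this loss pulls same-class client prototypes together and pushes different-class prototypes apart, the server-side counterpart to the margin-shrinking problem the paper targets.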
Md Anwar Hossen
Graduate Student, CS Department, Iowa State University
Federated Learning | Heterogeneous Systems | Foundation Models
Fatema Siddika
Department of Computer Science, Iowa State University, Ames, USA
Wensheng Zhang
Department of Computer Science, Iowa State University, Ames, USA
Anuj Sharma
Department of Computer Science, Iowa State University, Ames, USA
Ali Jannesari
Associate Professor, Iowa State University
high-performance computing | machine learning | parallel computing | software analytics