🤖 AI Summary
This work addresses semantic drift among clients in federated learning, which often produces inaccurate global prototypes and degrades model generalization. To mitigate this, the authors propose hyper-prototypes: learnable global class-wise prototypes that are optimized via gradient matching to align with class-relevant characteristics of clients' real samples. The resulting framework, FedHPro, further promotes inter-class separability through mutual contrastive learning with client-specific margins and encourages intra-class uniformity through a consistency penalty. Unlike conventional prototype averaging, which can induce semantic drift, the proposed method aims to preserve semantic coherence in the global signal under diverse heterogeneous settings. Experiments report state-of-the-art performance on several benchmark datasets.
📝 Abstract
Federated Learning (FL) enables collaborative training across distributed clients while protecting privacy. To enhance generalization in FL, prototype-based FL has attracted attention, since shared global prototypes offer semantic anchors for aligning client-specific local prototypes. However, existing methods update global prototypes at the prototype level, either by averaging local prototypes or by refining global anchors, which often leads to semantic drift across clients and, in turn, a misaligned global signal. To alleviate this issue, we introduce hyper-prototypes, a set of learnable global class-wise prototypes that preserve the underlying semantic knowledge across clients. The hyper-prototypes are optimized via gradient matching to align with class-relevant characteristics distilled directly from clients' real samples, rather than from prototype-level descriptors. We further propose FedHPro, a Federated Hyper-Prototype Learning framework, which leverages hyper-prototypes to promote inter-class separability via mutual-contrastive learning with client-specific margins, while encouraging intra-class uniformity through a consistency penalty. Comprehensive experiments under diverse heterogeneous scenarios confirm that 1) hyper-prototypes produce a more semantically consistent global signal, and 2) FedHPro achieves state-of-the-art performance on several benchmark datasets. Code is available at https://github.com/mala-lab/FedHPro.
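For context, the prototype-level update that the abstract contrasts against (averaging local prototypes) can be sketched as follows. This is a minimal illustration of the conventional baseline, not of FedHPro or its gradient-matching step. The function names, the unweighted average, and the toy two-client data are assumptions made for this sketch.

```python
from collections import defaultdict


def local_prototypes(features, labels):
    """Return the mean feature vector per class for one client."""
    sums = {}
    counts = defaultdict(int)
    for vec, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(vec)
        sums[y] = [s + v for s, v in zip(sums[y], vec)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def average_prototypes(client_protos):
    """Unweighted, class-by-class average of per-client prototypes."""
    grouped = defaultdict(list)
    for protos in client_protos:
        for y, p in protos.items():
            grouped[y].append(p)
    return {
        y: [sum(col) / len(ps) for col in zip(*ps)]
        for y, ps in grouped.items()
    }


# Two hypothetical clients whose class-0 features occupy different regions
# of a 2-D feature space (a toy stand-in for heterogeneous clients).
client_a = local_prototypes([[1.0, 0.0], [3.0, 0.0]], [0, 0])
client_b = local_prototypes([[0.0, 2.0], [0.0, 4.0]], [0, 0])
global_protos = average_prototypes([client_a, client_b])
```

In this toy case the averaged class-0 prototype lands between the two clients' prototypes and coincides with neither, which gives a rough intuition for the misaligned global signal the abstract describes.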