🤖 AI Summary
To address domain shift arising from client-wise domain heterogeneity in federated learning, this paper proposes a dual-granularity collaborative prototype learning framework. At the intra-domain level, it constructs robust class prototypes via local feature alignment and MixUp-based augmentation; at the inter-domain level, it introduces a weighted prototype aggregation mechanism to enable cross-domain knowledge fusion. This is the first work to jointly model local diversity and cross-domain consistency, explicitly mitigating domain shift. Evaluated on multi-domain benchmarks—Digits, Office-10, and PACS—the method achieves average accuracy improvements of 3.2–5.8% over existing federated prototype approaches. It significantly enhances the global model’s generalization capability and cross-domain adaptability while preserving privacy and communication efficiency inherent to federated learning.
📝 Abstract
Federated Learning (FL) has emerged as a decentralized machine learning technique, allowing clients to train a global model collaboratively without sharing private data. However, most FL studies ignore the crucial challenge of heterogeneous domains where each client has a distinct feature distribution, which is common in real-world scenarios. Prototype learning, which leverages the mean feature vectors within the same classes, has become a prominent solution for federated learning under domain skew. However, existing federated prototype learning methods only consider inter-domain prototypes on the server and overlook intra-domain characteristics. In this work, we introduce a novel federated prototype learning method, namely I$^2$PFL, which incorporates $ extbf{I}$ntra-domain and $ extbf{I}$nter-domain $ extbf{P}$rototypes, to mitigate domain shifts and learn a generalized global model across multiple domains in federated learning. To construct intra-domain prototypes, we propose feature alignment with MixUp-based augmented prototypes to capture the diversity of local domains and enhance the generalization of local features. Additionally, we introduce a reweighting mechanism for inter-domain prototypes to generate generalized prototypes to provide inter-domain knowledge and reduce domain skew across multiple clients. Extensive experiments on the Digits, Office-10, and PACS datasets illustrate the superior performance of our method compared to other baselines.