Federated Learning with Heterogeneous and Private Label Sets

📅 2025-08-26
📈 Citations: 0
Influential: 0
🤖 AI Summary
In federated learning, client label sets can be both heterogeneous and private: each client shares its label set only with the central server, while the global label set remains unknown to clients. This poses challenges for collaborative model training and introduces risks of label leakage. This paper proposes a federated learning framework tailored to private label sets. The approach combines classifier ensembles, centralized fine-tuning on the server, and adaptations of classic algorithms (e.g., FedAvg), enabling cross-client knowledge transfer without exposing local label spaces. Experiments demonstrate that the proposed method performs close to the public-label baseline under the private-label setting, with an accuracy degradation of less than 2%, while significantly mitigating label privacy leakage and keeping communication and computational overhead manageable.
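The FedAvg adaptation described above can be sketched as per-label aggregation on the server, which is the only party that sees every client's label ids. The function name, the dict-of-rows layout for classifier heads, and the averaging rule below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def aggregate_private_labels(client_heads, num_global_labels, dim):
    """Hypothetical FedAvg-style aggregation under private label sets.

    client_heads: one dict per client, mapping a global label id to that
    client's classifier-head row for the label. Clients never see each
    other's label ids; only the server holds the global label set.
    """
    agg = np.zeros((num_global_labels, dim))
    counts = np.zeros(num_global_labels)
    for head in client_heads:
        for label, w in head.items():
            agg[label] += w
            counts[label] += 1
    # Average each label row over only the clients that hold that label;
    # labels no client covers stay at zero.
    held = counts > 0
    agg[held] /= counts[held, None]
    return agg, counts

# Two clients with overlapping but unequal label subsets.
c1 = {0: np.array([1.0, 0.0]), 1: np.array([0.0, 1.0])}
c2 = {1: np.array([2.0, 1.0]), 2: np.array([1.0, 1.0])}
agg, counts = aggregate_private_labels([c1, c2], num_global_labels=3, dim=2)
```

Because each row is averaged only over the clients that actually hold the label, no client needs to learn which other labels exist in the federation.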

📝 Abstract
Although common in real-world applications, heterogeneous client label sets are rarely investigated in federated learning (FL). Furthermore, in the cases where they are, clients are assumed to be willing to share their entire label sets with other clients. Federated learning with private label sets, shared only with the central server, adds further constraints on learning algorithms and is, in general, a more difficult problem to solve. In this work, we study the effects of label set heterogeneity on model performance, comparing the public and private label settings -- when the union of label sets in the federation is known to clients and when it is not. We apply classical methods for the classifier combination problem to FL using centralized tuning, adapt common FL methods to the private label set setting, and discuss the justification of both approaches under practical assumptions. Our experiments show that reducing the number of labels available to each client harms the performance of all methods substantially. Centralized tuning of client models for representational alignment can help remedy this, but often at the cost of higher variance. Throughout, our proposed adaptations of standard FL methods perform well, showing similar performance in the private label setting as the standard methods achieve in the public setting. This shows that clients can enjoy increased privacy at little cost to model accuracy.
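The "classical methods for the classifier combination problem" that the abstract mentions can be illustrated with a simple mean rule over heterogeneous label sets: each client model scores only its own labels, and the server averages each label's score over the clients that cover it. This is a generic sketch of that classical rule under assumed data shapes, not the paper's specific ensemble:

```python
import numpy as np

def ensemble_combine(client_probs, num_global_labels):
    """Mean-rule classifier combination over heterogeneous label sets.

    client_probs: one dict per client for a single input, mapping a
    global label id to that client model's probability for the label.
    Returns the label with the highest averaged score.
    """
    scores = np.zeros(num_global_labels)
    counts = np.zeros(num_global_labels)
    for probs in client_probs:
        for label, p in probs.items():
            scores[label] += p
            counts[label] += 1
    covered = counts > 0
    scores[covered] /= counts[covered]
    return int(np.argmax(scores))

# Client 1 knows labels {0, 1}; client 2 knows labels {1, 2}.
pred = ensemble_combine(
    [{0: 0.7, 1: 0.3}, {1: 0.6, 2: 0.4}],
    num_global_labels=3,
)
```

Here label 0 wins (score 0.7) over label 1 (averaged to 0.45) and label 2 (0.4); averaging per covering client keeps widely shared labels from being favored merely for appearing in more ensembles.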
Problem

Research questions and friction points this paper is trying to address.

Addressing federated learning with heterogeneous client label sets
Studying effects of private versus public label set settings
Adapting FL methods to maintain privacy and performance
Innovation

Methods, ideas, or system contributions that make the work stand out.

Federated learning with private label sets
Centralized tuning for representational alignment
Adapting standard FL methods for privacy