Private Federated Multiclass Post-hoc Calibration

📅 2025-10-02
📈 Citations: 0 · Influential: 0
🤖 AI Summary
In federated learning (FL), post-hoc calibration of multi-class models remains challenging, especially in high-stakes domains such as healthcare and finance, where client data are strongly heterogeneous, cannot be centralized, and must satisfy strict privacy constraints. This paper brings histogram binning and temperature scaling into the FL setting for the first time, proposing two post-processing calibration strategies: weighted binning and federated temperature scaling. The former is designed for highly heterogeneous settings, while the latter is compatible with user-level differential privacy (DP). Experiments show that both methods substantially mitigate the calibration degradation induced by data heterogeneity: federated temperature scaling calibrates best under DP guarantees, whereas weighted binning performs best when privacy is not required. Together, the approaches offer a deployable recipe for reliable probability estimates in privacy-preserving FL.
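For context, temperature scaling is a standard centralized post-hoc method: a single scalar T rescales the logits so that softmax(z/T) is better calibrated, with T fit on held-out data. A minimal centralized sketch (the paper's federated variant is not specified here; the grid-search fitting below is an illustrative choice, not the paper's procedure):

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax over the last axis with temperature T."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def nll(T, logits, labels):
    """Average negative log-likelihood at temperature T."""
    p = softmax(logits, T)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    """Pick the temperature minimizing held-out NLL (simple grid search)."""
    return min(grid, key=lambda T: nll(T, logits, labels))
```

An overconfident model (confidence near 0.99 but accuracy 0.7, say) yields a fitted T above 1, which flattens the predicted distribution toward the true accuracy.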

📝 Abstract
Calibrating machine learning models so that predicted probabilities better reflect the true outcome frequencies is crucial for reliable decision-making across many applications. In Federated Learning (FL), the goal is to train a global model on data which is distributed across multiple clients and cannot be centralized due to privacy concerns. FL is applied in key areas such as healthcare and finance where calibration is strongly required, yet federated private calibration has been largely overlooked. This work introduces the integration of post-hoc model calibration techniques within FL. Specifically, we transfer traditional centralized calibration methods such as histogram binning and temperature scaling into federated environments and define new methods to operate them under strong client heterogeneity. We study (1) a federated setting and (2) a user-level Differential Privacy (DP) setting and demonstrate how both federation and DP impact calibration accuracy. We propose strategies to mitigate degradation commonly observed under heterogeneity, and our findings highlight that our federated temperature scaling works best for DP-FL whereas our weighted binning approach is best when DP is not required.
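The other centralized baseline the abstract names, histogram binning, replaces each predicted confidence with the empirical accuracy of the held-out samples falling in the same confidence bin. A minimal centralized sketch with equal-width bins (the paper's weighted, federated variant is not specified here):

```python
import numpy as np

def histogram_binning(conf, correct, n_bins=10):
    """Fit equal-width histogram binning on held-out data:
    each bin maps to the empirical accuracy of its samples."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(conf, edges) - 1, 0, n_bins - 1)
    bin_acc = np.full(n_bins, np.nan)  # NaN marks empty bins
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            bin_acc[b] = correct[mask].mean()

    def calibrate(c):
        """Map new confidences to their bin's empirical accuracy."""
        j = np.clip(np.digitize(c, edges) - 1, 0, n_bins - 1)
        return bin_acc[j]

    return calibrate
```

In a federated setting the per-bin counts would have to be aggregated across clients rather than computed centrally, which is where heterogeneity and privacy constraints come in.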
Problem

Research questions and friction points this paper is trying to address.

Calibrating federated learning models for accurate probability predictions
Addressing calibration challenges under client heterogeneity and privacy constraints
Developing private post-hoc calibration methods for federated environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Post-hoc calibration integration in federated learning
Centralized methods adapted for client heterogeneity
Differential privacy-aware strategies for calibration accuracy
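The summary does not detail the DP-aware strategy. As a purely illustrative pattern (an assumption, not the paper's algorithm), user-level DP aggregation of per-client scalar calibration statistics, e.g. locally fitted temperatures, is often sketched with the Gaussian mechanism; `clip` and `sigma` below are hypothetical parameters:

```python
import numpy as np

def dp_average(values, clip=5.0, sigma=1.0, rng=None):
    """Illustrative user-level DP mean via the Gaussian mechanism:
    clip each client's scalar contribution to [0, clip], average,
    then add noise scaled to the mean's sensitivity (clip / n)."""
    rng = np.random.default_rng(0) if rng is None else rng
    v = np.clip(np.asarray(values, dtype=float), 0.0, clip)
    noise = rng.normal(0.0, sigma * clip / len(v))
    return v.mean() + noise
```

Clipping bounds any single user's influence on the released average, which is the standard precondition for a user-level DP guarantee.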