Private Federated Multiclass Post-hoc Calibration

📅 2025-10-02
📈 Citations: 0 · Influential: 0
🤖 AI Summary
In federated learning (FL), post-hoc calibration of multi-class models remains challenging, especially in high-stakes domains such as healthcare and finance, where client data are strongly heterogeneous, cannot be centralized, and must satisfy strict privacy constraints. This paper brings histogram binning and temperature scaling into the FL setting for the first time, proposing two post-processing calibration strategies: weighted binning and federated temperature scaling. The former is designed for highly heterogeneous settings, while the latter is compatible with user-level differential privacy (DP). Experiments show that both methods substantially mitigate the calibration degradation induced by data heterogeneity: federated temperature scaling calibrates best under DP guarantees, whereas weighted binning performs best when privacy is not required. Together, the approaches offer a deployable recipe for reliable probability estimates in privacy-preserving FL.
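For context, temperature scaling is a standard centralized post-hoc method: a single scalar T rescales the logits so that softmax(z/T) is better calibrated, with T fit on held-out data. A minimal centralized sketch (the paper's federated variant is not specified here; the grid-search fitting below is an illustrative choice, not the paper's procedure):

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax over the last axis with temperature T."""
    z = z / T
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def nll(T, logits, labels):
    """Average negative log-likelihood at temperature T."""
    p = softmax(logits, T)
    return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    """Pick the temperature minimizing held-out NLL (simple grid search)."""
    return min(grid, key=lambda T: nll(T, logits, labels))
```

An overconfident model (confidence near 0.99 but accuracy 0.7, say) yields a fitted T above 1, which flattens the predicted distribution toward the true accuracy.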

📝 Abstract
Calibrating machine learning models so that predicted probabilities better reflect the true outcome frequencies is crucial for reliable decision-making across many applications. In Federated Learning (FL), the goal is to train a global model on data which is distributed across multiple clients and cannot be centralized due to privacy concerns. FL is applied in key areas such as healthcare and finance where calibration is strongly required, yet federated private calibration has been largely overlooked. This work introduces the integration of post-hoc model calibration techniques within FL. Specifically, we transfer traditional centralized calibration methods such as histogram binning and temperature scaling into federated environments and define new methods to operate them under strong client heterogeneity. We study (1) a federated setting and (2) a user-level Differential Privacy (DP) setting and demonstrate how both federation and DP impact calibration accuracy. We propose strategies to mitigate degradation commonly observed under heterogeneity, and our findings highlight that our federated temperature scaling works best for DP-FL whereas our weighted binning approach is best when DP is not required.
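The other centralized baseline the abstract names, histogram binning, replaces each predicted confidence with the empirical accuracy of the held-out samples falling in the same confidence bin. A minimal centralized sketch with equal-width bins (the paper's weighted, federated variant is not specified here):

```python
import numpy as np

def histogram_binning(conf, correct, n_bins=10):
    """Fit equal-width histogram binning on held-out data:
    each bin maps to the empirical accuracy of its samples."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(conf, edges) - 1, 0, n_bins - 1)
    bin_acc = np.full(n_bins, np.nan)  # NaN marks empty bins
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            bin_acc[b] = correct[mask].mean()

    def calibrate(c):
        """Map new confidences to their bin's empirical accuracy."""
        j = np.clip(np.digitize(c, edges) - 1, 0, n_bins - 1)
        return bin_acc[j]

    return calibrate
```

In a federated setting the per-bin counts would have to be aggregated across clients rather than computed centrally, which is where heterogeneity and privacy constraints come in.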
Problem

Research questions and friction points this paper is trying to address.

Calibrating federated learning models for accurate probability predictions
Addressing calibration challenges under client heterogeneity and privacy constraints
Developing private post-hoc calibration methods for federated environments
Innovation

Methods, ideas, or system contributions that make the work stand out.

Post-hoc calibration integration in federated learning
Centralized methods adapted for client heterogeneity
Differential privacy-aware strategies for calibration accuracy
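The summary does not detail the DP-aware strategy. As a purely illustrative pattern (an assumption, not the paper's algorithm), user-level DP aggregation of per-client scalar calibration statistics, e.g. locally fitted temperatures, is often sketched with the Gaussian mechanism; `clip` and `sigma` below are hypothetical parameters:

```python
import numpy as np

def dp_average(values, clip=5.0, sigma=1.0, rng=None):
    """Illustrative user-level DP mean via the Gaussian mechanism:
    clip each client's scalar contribution to [0, clip], average,
    then add noise scaled to the mean's sensitivity (clip / n)."""
    rng = np.random.default_rng(0) if rng is None else rng
    v = np.clip(np.asarray(values, dtype=float), 0.0, clip)
    noise = rng.normal(0.0, sigma * clip / len(v))
    return v.mean() + noise
```

Clipping bounds any single user's influence on the released average, which is the standard precondition for a user-level DP guarantee.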