h-calibration: Rethinking Classifier Recalibration with Probabilistic Error-Bounded Objective

📅 2025-06-22
📈 Citations: 0
Influential: 0
🤖 AI Summary
Deep neural networks often yield miscalibrated output probabilities, undermining their reliability despite strong predictive performance. To address this, we propose the *h-calibration* framework, a novel, differentiable, and theoretically grounded calibration paradigm with bounded approximation error. Our method formalizes probabilistic learning and ideal calibration, guides the design of a posterior recalibration algorithm, and establishes a rigorous convergence relationship between statistical estimator consistency and theoretical error bounds. Unlike conventional approaches based on proper scoring rules, h-calibration overcomes ten fundamental limitations of existing methods, including non-differentiability, lack of theoretical guarantees, and poor generalization across architectures and datasets. Theoretical analysis demonstrates its superiority in terms of both calibration fidelity and optimization tractability. In extensive experiments on standard post-hoc calibration benchmarks, h-calibration achieves state-of-the-art performance, validating its effectiveness, generalizability, and robustness across diverse models and datasets.
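The miscalibration the summary refers to is commonly quantified with the expected calibration error (ECE): bin predictions by confidence and compare each bin's average confidence with its empirical accuracy. The sketch below is the standard binned diagnostic, not the paper's h-calibration objective; the bin count and toy data are illustrative assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Equal-width binned ECE: weighted mean |accuracy - confidence| per bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight bin by its share of samples
    return ece

# A model that is 99% confident but only 50% accurate is badly miscalibrated:
conf = np.full(100, 0.99)
acc = np.array([1, 0] * 50, dtype=float)
print(round(expected_calibration_error(conf, acc), 2))  # 0.49
```

A perfectly calibrated model drives this gap to zero in every bin, which is the informal target that the paper's error-bounded objective makes differentiable and provable.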

📝 Abstract
Deep neural networks have demonstrated remarkable performance across numerous learning tasks but often suffer from miscalibration, resulting in unreliable probability outputs. This has inspired many recent works on mitigating miscalibration, particularly through post-hoc recalibration methods that aim to obtain calibrated probabilities without sacrificing the classification performance of pre-trained models. In this study, we summarize and categorize previous works into three general strategies: intuitively designed methods, binning-based methods, and methods based on formulations of ideal calibration. Through theoretical and practical analysis, we highlight ten common limitations in previous approaches. To address these limitations, we propose a probabilistic learning framework for calibration called h-calibration, which theoretically constructs an equivalent learning formulation for canonical calibration with boundedness. On this basis, we design a simple yet effective post-hoc calibration algorithm. Our method not only overcomes the ten identified limitations but also achieves markedly better performance than traditional methods, as validated by extensive experiments. We further analyze, both theoretically and experimentally, the relationship and advantages of our learning objective compared to traditional proper scoring rules. In summary, our probabilistic framework derives an approximately equivalent differentiable objective for learning error-bounded calibrated probabilities, elucidating the correspondence and convergence properties of computational statistics with respect to theoretical bounds in canonical calibration. The theoretical effectiveness is verified on standard post-hoc calibration benchmarks by achieving state-of-the-art performance. This research offers a valuable reference for learning reliable likelihoods in related fields.
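As a concrete instance of the post-hoc recalibration setting the abstract describes, temperature scaling (a classic baseline, not this paper's algorithm) rescales a trained model's logits by a single scalar T fit on held-out data, leaving the argmax class, and hence accuracy, unchanged. The grid search and synthetic overconfident logits below are assumptions made for the sketch.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerically stable
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 451)):
    """Pick the T minimizing validation NLL of softmax(logits / T)."""
    def nll(T):
        p = softmax(logits / T)
        return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    return min(grid, key=nll)

# Toy setup: logits that are informative but artificially overconfident.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=500)
logits = rng.normal(size=(500, 3))
logits[np.arange(500), labels] += 1.0  # signal on the true class
logits *= 4.0                          # inflate confidence 4x
T = fit_temperature(logits, labels)
print(f"fitted T = {T:.2f}")  # T > 1 means the raw logits were overconfident
```

Because dividing logits by a positive T preserves their ordering, this is exactly the "without sacrificing classification performance" property the abstract emphasizes; the paper's contribution is a richer, error-bounded objective in place of the NLL surrogate used here.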
Problem

Research questions and friction points this paper is trying to address.

Addressing miscalibration in deep neural networks' probability outputs
Overcoming limitations in existing post-hoc recalibration methods
Proposing h-calibration, an error-bounded probabilistic learning framework
Innovation

Methods, ideas, or system contributions that make the work stand out.

Probabilistic learning framework for calibration
Error-bounded differentiable calibration objective
Simple effective post-hoc calibration algorithm
Wenjian Huang
Peking University
BioMedical Image & Signal Processing, Machine Learning, Artificial Intelligence, Statistical Learning, Computer Vision
Guiping Cao
PCL; SUSTech; CVTE Research; XJTU
Deep Learning, Computer Vision, Medical Image Processing
Jiahao Xia
Research Fellow, University of Technology Sydney
Deep Learning
Jingkun Chen
University of Oxford
Medical Image Analysis, Computer Vision, Machine Learning
Hao Wang
Research Inst. of Trustworthy Autonomous Systems & Dept. of Computer Science and Engineering, SUSTech, China
Jianguo Zhang
Research Inst. of Trustworthy Autonomous Systems & Dept. of Computer Science and Engineering, SUSTech, China; Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation, Dept. of Computer Science and Engineering, SUSTech, and Peng Cheng Lab, China