Calibration and Discrimination Optimization Using Clusters of Learned Representation

📅 2025-10-22

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Machine learning models used in high-stakes decision-making (e.g., clinical prediction) are often insufficiently calibrated, and improving calibration commonly trades off against discriminative performance. To address both problems, this paper proposes an ensemble calibration framework based on clustering learned representations. It first learns sample representations to capture latent-space structure, then performs adaptive clustering to partition heterogeneous subpopulations. A dedicated calibration function is constructed for each cluster, and the functions are optimized jointly to minimize calibration error while preserving discriminative power. The framework is model-agnostic: it is compatible with diverse base models and evaluation metrics and requires no modification to the original model architecture. The paper reports that the pipeline improves the calibration score of various calibration methods from 82.28% up to 100%, while maintaining or even improving discriminative metrics such as AUC. It also introduces a matching metric so that model selection optimizes both discrimination and calibration. The core innovation is aligning calibration granularity with the intrinsic structure of the representation, and improving calibration and discrimination together through multi-calibrator ensembling and joint optimization.

📝 Abstract

Machine learning models are essential for decision-making and risk assessment, requiring highly reliable predictions in terms of both discrimination and calibration. While calibration often receives less attention, it is crucial for critical decisions, such as those in clinical predictions. We introduce a novel calibration pipeline that leverages an ensemble of calibration functions trained on clusters of learned representations of the input samples to enhance overall calibration. This approach not only improves the calibration score of various methods from 82.28% up to 100% but also introduces a unique matching metric that ensures model selection optimizes both discrimination and calibration. Our generic scheme adapts to any underlying representation, clustering method, calibration method, and metric, offering flexibility and superior performance across commonly used calibration methods.
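
The pipeline described in the abstract (cluster the learned representations, then train one calibration function per cluster) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the cluster centroids are already given, assigns each sample to its nearest centroid, and uses histogram binning as a stand-in calibrator. It also simplifies the ensemble by routing each sample to a single calibrator. All function names and the toy data are hypothetical.

```python
from collections import defaultdict

def nearest_centroid(rep, centroids):
    """Index of the centroid closest to `rep` (squared Euclidean distance)."""
    dists = [sum((r - c) ** 2 for r, c in zip(rep, cen)) for cen in centroids]
    return dists.index(min(dists))

def bin_index(score, n_bins):
    """Equal-width bin for a score in [0, 1]; a score of 1.0 goes in the last bin."""
    return min(int(score * n_bins), n_bins - 1)

def fit_binning_calibrator(scores, labels, n_bins=5):
    """Histogram binning: map each score bin to its observed positive rate."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for s, y in zip(scores, labels):
        b = bin_index(s, n_bins)
        sums[b] += y
        counts[b] += 1
    # Empty bins fall back to the bin midpoint (assumption: leave scores roughly unchanged).
    return [sums[b] / counts[b] if counts[b] else (b + 0.5) / n_bins
            for b in range(n_bins)]

def fit_cluster_calibrators(reps, scores, labels, centroids, n_bins=5):
    """Group samples by cluster, then fit one calibrator per cluster."""
    grouped = defaultdict(lambda: ([], []))
    for rep, s, y in zip(reps, scores, labels):
        k = nearest_centroid(rep, centroids)
        grouped[k][0].append(s)
        grouped[k][1].append(y)
    return {k: fit_binning_calibrator(s, y, n_bins) for k, (s, y) in grouped.items()}

def calibrate(rep, score, centroids, calibrators, n_bins=5):
    """Calibrate a score with the calibrator of the sample's cluster."""
    table = calibrators.get(nearest_centroid(rep, centroids))
    if table is None:          # cluster had no calibration data
        return score
    return table[bin_index(score, n_bins)]

# Toy data: the base model outputs 0.9 everywhere, but it is
# overconfident in cluster 0 and underconfident in cluster 1.
centroids = [(0.0, 0.0), (10.0, 10.0)]
reps = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 9.0)]
scores = [0.9, 0.9, 0.9, 0.9]
labels = [1, 0, 1, 1]

calibrators = fit_cluster_calibrators(reps, scores, labels, centroids)
p_cluster0 = calibrate((0.0, 0.5), 0.9, centroids, calibrators)
p_cluster1 = calibrate((9.0, 9.0), 0.9, centroids, calibrators)
```

In this toy example the same raw score of 0.9 is mapped to different calibrated probabilities depending on the cluster, which a single global calibrator could not do. In practice the clustering and calibration methods would be swapped for whichever ones the pipeline is configured with.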

Problem

Research questions and friction points this paper is trying to address.

Optimizing calibration and discrimination for reliable predictions

Enhancing calibration using clustered learned representations

Ensuring model selection balances discrimination and calibration metrics
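
To make the distinction between discrimination and calibration concrete, the sketch below computes AUC (a discrimination measure named in the summary) and expected calibration error, ECE (a standard calibration measure, used here as an assumption since this page does not say which calibration score the paper uses). The toy scores are hypothetical and chosen so that a model ranks samples perfectly yet is poorly calibrated.

```python
def auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def ece(scores, labels, n_bins=10):
    """Expected calibration error over equal-width bins:
    size-weighted mean of |observed positive rate - mean predicted score|."""
    bins = [[] for _ in range(n_bins)]
    for s, y in zip(scores, labels):
        bins[min(int(s * n_bins), n_bins - 1)].append((s, y))
    total = len(scores)
    err = 0.0
    for b in bins:
        if b:
            mean_score = sum(s for s, _ in b) / len(b)
            pos_rate = sum(y for _, y in b) / len(b)
            err += len(b) / total * abs(pos_rate - mean_score)
    return err

labels = [0, 0, 1, 1]
ranked_only = [0.6, 0.7, 0.8, 0.9]   # correct ordering, but negatives get high scores
well_calibrated = [0.0, 0.0, 1.0, 1.0]

auc_ranked = auc(ranked_only, labels)
ece_ranked = ece(ranked_only, labels)
auc_cal = auc(well_calibrated, labels)
ece_cal = ece(well_calibrated, labels)
```

Both score sets reach the same AUC, but only the second has zero ECE, which is why selecting a model on discrimination alone can leave calibration unaddressed.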

Innovation

Methods, ideas, or system contributions that make the work stand out.

Ensemble calibration functions trained on clustered representations

Matching metric optimizes both discrimination and calibration

Generic scheme adapts to various methods and metrics

🔎 Similar Papers

Optimizing Estimators of Squared Calibration Errors in Classification