Enforcing Calibration in Multi-Output Probabilistic Regression with Pre-rank Regularization

📅 2025-10-24
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Multivariate calibration in multi-output probabilistic regression suffers from ambiguous definitions and practical implementation challenges. Method: This paper proposes a generic pre-rank regularization framework that (i) uses pre-rank functions for active calibration rather than mere diagnostic assessment, introducing a joint regularized loss that combines marginal and multivariate calibration; (ii) devises a PCA-based pre-rank that automatically identifies calibration-sensitive principal directions of the predictive distribution; and (iii) incorporates a PIT uniformity deviation penalty compatible with highest-density-region calibration and copula calibration. The framework is plug-and-play and can be integrated into the training objective of any probabilistic model. Contribution/Results: Evaluated on 18 real-world multi-output datasets, the framework significantly improves multivariate calibration across diverse pre-rank functions without compromising predictive accuracy.

📝 Abstract
Probabilistic models must be well calibrated to support reliable decision-making. While calibration in single-output regression is well studied, defining and achieving multivariate calibration in multi-output regression remains considerably more challenging. The existing literature on multivariate calibration primarily focuses on diagnostic tools based on pre-rank functions, which are projections that reduce multivariate prediction-observation pairs to univariate summaries to detect specific types of miscalibration. In this work, we go beyond diagnostics and introduce a general regularization framework to enforce multivariate calibration during training for arbitrary pre-rank functions. This framework encompasses existing approaches such as highest density region calibration and copula calibration. Our method enforces calibration by penalizing deviations of the projected probability integral transforms (PITs) from the uniform distribution, and can be added as a regularization term to the loss function of any probabilistic predictor. Specifically, we propose a regularization loss that jointly enforces both marginal and multivariate pre-rank calibration. We also introduce a new PCA-based pre-rank that captures calibration along directions of maximal variance in the predictive distribution, while also enabling dimensionality reduction. Across 18 real-world multi-output regression datasets, we show that unregularized models are consistently miscalibrated, and that our methods significantly improve calibration across all pre-rank functions without sacrificing predictive accuracy.
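The core mechanism described in the abstract, penalizing deviations of the PITs from the uniform distribution as a regularization term added to the predictor's loss, can be illustrated with a minimal sketch. The function name, the Gaussian predictive form, and the Cramér-von-Mises-style penalty below are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np
from scipy.stats import norm

def regularized_gaussian_loss(mu, sigma, y, lam=1.0):
    """Illustrative sketch (not the paper's implementation): Gaussian NLL
    plus a marginal PIT-uniformity penalty with weight lam.
    mu, sigma, y: arrays of shape (n,)."""
    # Negative log-likelihood of a per-point Gaussian predictor
    nll = np.mean(0.5 * np.log(2 * np.pi * sigma**2)
                  + (y - mu)**2 / (2 * sigma**2))
    # PITs of the observations under the predictive CDF
    pits = np.sort(norm.cdf(y, loc=mu, scale=sigma))
    # Uniform plotting positions; a calibrated model yields pits ~ U(0, 1)
    u = (np.arange(1, len(y) + 1) - 0.5) / len(y)
    # Cramer-von-Mises-style deviation of sorted PITs from uniformity
    penalty = np.mean((pits - u)**2)
    return nll + lam * penalty
```

An overdispersed predictor (e.g., predictive scale 3 when the data have unit scale) concentrates its PITs near 0.5, so the penalty term grows, which is exactly the kind of miscalibration the regularizer pushes against during training.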
Problem

Research questions and friction points this paper is trying to address.

Enforcing multivariate calibration in multi-output probabilistic regression models
Addressing calibration challenges beyond single-output regression diagnostics
Regularizing deviations of probability integral transforms from the uniform distribution
Innovation

Methods, ideas, or system contributions that make the work stand out.

Regularization framework enforces multivariate calibration during training
Penalizes deviations of projected PITs from uniform distribution
Introduces PCA-based pre-rank for calibration along variance directions
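The PCA-based pre-rank idea above can be sketched as follows: project the predictive samples and the observation onto a leading principal direction of the predictive distribution, compute an empirical PIT in that direction, and penalize non-uniformity of the PITs across the batch. The function name and the restriction to the single leading direction are assumptions for illustration; the paper's method supports general pre-rank functions:

```python
import numpy as np

def pca_prerank_pit_penalty(pred_samples, y):
    """Illustrative sketch (not the paper's implementation) of a PCA-based
    pre-rank PIT-uniformity penalty, using only the leading direction.
    pred_samples: (batch, n_samples, d) samples from the predictive distribution
    y: (batch, d) observations."""
    batch = pred_samples.shape[0]
    pits = []
    for i in range(batch):
        s = pred_samples[i]                       # (n_samples, d)
        centered = s - s.mean(axis=0)
        # Leading principal direction of the predictive samples
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        w = vt[0]
        # Pre-rank: project samples and observation onto that direction
        proj_s = s @ w
        proj_y = y[i] @ w
        # Empirical PIT of the projected observation
        pits.append((proj_s <= proj_y).mean())
    pits = np.sort(np.array(pits))
    # Deviation of sorted PITs from uniform plotting positions
    u = (np.arange(1, batch + 1) - 0.5) / batch
    return float(np.mean((pits - u) ** 2))
```

For a calibrated model the projected PITs are approximately uniform and the penalty is close to zero; a systematic bias in the predictions pushes the PITs toward the ends of [0, 1] and inflates the penalty.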
Naomi Desobry
Department of Big Data and Machine Learning, University of Mons
Elnura Zhalieva
Department of Statistics and Data Science, Mohamed Bin Zayed University of Artificial Intelligence
Souhaib Ben Taieb
MBZUAI, UMONS
Artificial intelligence · Statistics · Forecasting · Time series · Conformal inference