$L_2$-Regularized Empirical Risk Minimization Guarantees Small Smooth Calibration Error

📅 2025-10-15
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
This paper investigates whether standard $L_2$-regularized empirical risk minimization (ERM) can achieve good probabilistic calibration *intrinsically*, without post-hoc calibration or dedicated calibration regularizers. Methodologically, it establishes the first generalization analysis framework for the smooth calibration error (smCE), explicitly linking calibration performance to optimization error, regularization strength, and Rademacher complexity. For models in reproducing kernel Hilbert spaces (RKHS), it derives finite-sample upper bounds on smCE for kernel ridge regression and logistic regression. Experimentally, standard $L_2$-regularized ERM alone achieves calibration competitive with state-of-the-art calibration-specific methods. The core contribution is the theoretical insight that $L_2$ regularization inherently provides calibration guarantees, together with the first interpretable, quantifiable generalization theory for calibration, bridging regularization, optimization, and statistical learning theory in a unified framework.
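To make the central quantity concrete, below is a minimal sketch of an empirical smCE estimator. It reads smCE as a supremum over 1-Lipschitz weight functions bounded in $[-1, 1]$ and solves the resulting linear program with scipy; the function name and the LP reduction are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog

def smooth_calibration_error(probs, labels):
    """Empirical smooth calibration error (sketch, n >= 2 assumed):
    sup over 1-Lipschitz w: [0,1] -> [-1,1] of (1/n) * sum_i w(p_i) * (y_i - p_i).
    Optimizing the values w_i = w(p_i) directly suffices: on sorted
    predictions, Lipschitz constraints between consecutive points imply
    the constraint for every pair.
    """
    order = np.argsort(probs)
    p, y = np.asarray(probs)[order], np.asarray(labels)[order]
    n = len(p)
    c = -(y - p) / n  # linprog minimizes, so negate the objective
    # |w_{i+1} - w_i| <= p_{i+1} - p_i, written as two inequalities per pair
    rows, b = [], []
    for i in range(n - 1):
        up = np.zeros(n); up[i], up[i + 1] = -1.0, 1.0
        dn = np.zeros(n); dn[i], dn[i + 1] = 1.0, -1.0
        rows += [up, dn]
        b += [p[i + 1] - p[i]] * 2
    res = linprog(c, A_ub=np.array(rows), b_ub=np.array(b),
                  bounds=[(-1.0, 1.0)] * n, method="highs")
    return -res.fun
```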

📝 Abstract
Calibration of predicted probabilities is critical for reliable machine learning, yet it is poorly understood how standard training procedures yield well-calibrated models. This work provides the first theoretical proof that canonical $L_{2}$-regularized empirical risk minimization directly controls the smooth calibration error (smCE), without post-hoc correction or specialized calibration-promoting regularizers. We establish finite-sample generalization bounds for smCE based on optimization error, regularization strength, and Rademacher complexity. We then instantiate this theory for models in reproducing kernel Hilbert spaces, deriving concrete guarantees for kernel ridge and logistic regression. Our experiments confirm these guarantees, demonstrating that $L_{2}$-regularized ERM can provide a well-calibrated model without boosting or post-hoc recalibration. The source code to reproduce all experiments is available at https://github.com/msfuji0211/erm_calibration.
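As a rough illustration of the experimental claim (not the paper's actual setup), one can fit a plain $L_2$-regularized kernel model and measure its held-out smCE with the estimator sketched above; the synthetic dataset and hyperparameters below are arbitrary assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import train_test_split

# Synthetic binary task; dataset and hyperparameters are illustrative only.
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

# Kernel ridge regression on 0/1 labels: squared loss plus an L2 (RKHS-norm)
# penalty, i.e. the kernel ridge setting that the paper's smCE bound covers.
krr = KernelRidge(alpha=1.0, kernel="rbf", gamma=0.1).fit(X_tr, y_tr)
probs = np.clip(krr.predict(X_te), 0.0, 1.0)  # clip raw scores to [0, 1]

print("held-out smCE:", smooth_calibration_error(probs, y_te.astype(float)))
```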
Problem

Research questions and friction points this paper addresses.

Why do standard training procedures often yield well-calibrated models, despite optimizing no explicit calibration objective?
Can canonical L2-regularized ERM be proven to control smooth calibration error, without post-hoc correction or specialized calibration regularizers?
How can generalization bounds for calibration error be expressed in terms of optimization error, regularization strength, and model complexity?
Innovation

Methods, ideas, or system contributions that make the work stand out.

A proof that plain L2-regularized ERM directly controls the smooth calibration error (smCE)
The first generalization analysis framework for smCE, requiring no post-hoc recalibration or calibration-specific regularizers
Finite-sample bounds driven by optimization error, regularization strength, and Rademacher complexity, instantiated for kernel ridge and logistic regression in an RKHS (see the schematic decomposition below)
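Schematically, and only as a paraphrase of the dependence stated above (not the paper's exact theorem), bounds of this shape decompose as

$$
\mathrm{smCE}(\hat{f}) \;\lesssim\;
\underbrace{\varepsilon_{\mathrm{opt}}}_{\text{optimization error}}
\;+\; \underbrace{c(\lambda)}_{\text{regularization term}}
\;+\; \underbrace{\mathfrak{R}_n(\mathcal{F}_\lambda)}_{\text{Rademacher complexity}}
\;+\; O\!\left(\sqrt{\frac{\log(1/\delta)}{n}}\right),
$$

where $\hat{f}$ is the regularized ERM solution; the paper makes each term explicit for kernel ridge and logistic regression in an RKHS.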