Fast and Scalable Score-Based Kernel Calibration Tests

📅 2025-10-16
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses three key challenges in assessing the calibration of probabilistic models: high computational cost, poor scalability, and difficulty in controlling Type-I error. To this end, the authors propose the Kernel Calibration Conditional Stein Discrepancy (KCCSD) test, a non-parametric hypothesis test grounded in kernel methods. KCCSD introduces a new family of score-based kernels that can be estimated without density evaluations, and combines a Stein discrepancy with a conditional goodness-of-fit testing framework, thereby circumventing explicit expectation approximation. The test statistic is constructed as a U-statistic, keeping computation efficient and scalable. Under mild regularity conditions, the test provides control over its Type-I error. Empirical evaluations on diverse synthetic benchmarks show that KCCSD compares favorably with existing methods, with strong statistical power, favorable scaling in sample size and dimensionality, and robust Type-I/Type-II error control.

📝 Abstract
We introduce the Kernel Calibration Conditional Stein Discrepancy test (KCCSD test), a non-parametric, kernel-based test for assessing the calibration of probabilistic models with well-defined scores. In contrast to previous methods, our test avoids the need for possibly expensive expectation approximations while providing control over its type-I error. We achieve these improvements by using a new family of kernels for score-based probabilities that can be estimated without probability density samples, and by using a conditional goodness-of-fit criterion for the KCCSD test's U-statistic. We demonstrate the properties of our test on various synthetic settings.
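The abstract's core ingredients, a score-based (Stein) kernel that needs only the model's score function and a U-statistic built from it, can be illustrated with a minimal sketch. The kernel construction below is the standard kernel Stein discrepancy with an RBF base kernel in one dimension, not the paper's exact conditional kernel; the function names and the example distributions are illustrative choices.

```python
import numpy as np

def stein_kernel_1d(x, y, score, sigma=1.0):
    """Stein (score-based) kernel h_p(x, y) built from an RBF base kernel.

    Only the score s_p(x) = d/dx log p(x) is required -- no normalized
    density and no samples from p.  Illustrative sketch of the
    density-free kernel idea, not the paper's exact construction.
    """
    d = x - y
    k = np.exp(-d**2 / (2 * sigma**2))           # RBF base kernel
    dkx = -d / sigma**2 * k                      # d/dx k(x, y)
    dky = d / sigma**2 * k                       # d/dy k(x, y)
    dkxy = (1 / sigma**2 - d**2 / sigma**4) * k  # d^2/(dx dy) k(x, y)
    return score(x) * score(y) * k + score(x) * dky + score(y) * dkx + dkxy

def ksd_u_statistic(samples, score, sigma=1.0):
    """Unbiased U-statistic estimate of the squared kernel Stein discrepancy."""
    x = np.asarray(samples)
    n = len(x)
    H = stein_kernel_1d(x[:, None], x[None, :], score, sigma)
    np.fill_diagonal(H, 0.0)                     # U-statistic drops i == j terms
    return H.sum() / (n * (n - 1))

rng = np.random.default_rng(0)
score_std_normal = lambda x: -x                  # score of N(0, 1)

# Statistic is near zero when the samples match the model's score ...
well_calibrated = ksd_u_statistic(rng.normal(0, 1, 300), score_std_normal)
# ... and clearly positive when they do not.
miscalibrated = ksd_u_statistic(rng.normal(2, 1, 300), score_std_normal)
```

Because the test statistic averages a kernel over all sample pairs, it costs O(n²) kernel evaluations but involves no sampling from the model and no intractable expectations, which is the source of the efficiency the abstract refers to.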
Problem

Research questions and friction points this paper is trying to address.

Assessing the calibration of probabilistic models with well-defined scores
Avoiding expensive expectation approximations in calibration tests
Providing Type-I error control for kernel-based tests
Innovation

Methods, ideas, or system contributions that make the work stand out.

Kernel-based test for probabilistic model calibration
Avoids expensive expectation approximations
Uses score-based kernels without density samples
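The Type-I error control highlighted above is typically obtained by calibrating the U-statistic's rejection threshold under the null. A standard device for degenerate kernel U-statistics is the wild (multiplier) bootstrap; the sketch below is that generic procedure, not the paper's exact calibration scheme, and the function name is a hypothetical choice.

```python
import numpy as np

def wild_bootstrap_pvalue(H, n_boot=500, rng=None):
    """Wild-bootstrap p-value for a degenerate kernel U-statistic.

    H is the n x n matrix of kernel evaluations h(z_i, z_j) with zero
    diagonal; the observed statistic is sum(H) / (n * (n - 1)).  Each
    bootstrap replicate reweights H by an outer product of Rademacher
    signs, mimicking the statistic's null distribution.  Generic sketch
    of Type-I error calibration for kernel tests.
    """
    rng = np.random.default_rng(rng)
    n = H.shape[0]
    observed = H.sum() / (n * (n - 1))
    boots = np.empty(n_boot)
    for b in range(n_boot):
        eps = rng.choice([-1.0, 1.0], size=n)    # Rademacher multipliers
        boots[b] = (eps[:, None] * eps[None, :] * H).sum() / (n * (n - 1))
    # Add-one correction keeps the p-value strictly positive.
    return (1 + np.sum(boots >= observed)) / (1 + n_boot)

# Example: a kernel matrix with a strong positive signal yields a tiny p-value.
H = np.ones((50, 50))
np.fill_diagonal(H, 0.0)
p = wild_bootstrap_pvalue(H, n_boot=500, rng=0)
```

Rejecting when the p-value falls below the nominal level gives a test whose false-rejection rate is controlled by the bootstrap approximation of the null distribution.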