Uncertainty in Bayesian Leave-One-Out Cross-Validation Based Model Comparison

📅 2020-08-24
📈 Citations: 85
Influential: 4
🤖 AI Summary
Standard errors for LOO-CV predictive performance estimates in Bayesian model comparison are often underestimated, especially with small samples, misspecified models, or near-identical model predictions, because skewness in the LOO-CV error distribution can persist even asymptotically. Method: the paper gives a theoretical analysis showing that this skewness need not vanish in the infinite-data limit, invalidating Gaussian approximations there, and combines asymptotic analysis, Monte Carlo simulation, and explicit modeling of the error distribution into a practical uncertainty quantification framework. Contribution/Results: a skewness-based warning diagnostic and a calibration procedure that improve the reliability of the reported uncertainty, moving Bayesian model comparison from point-estimate-based conclusions toward reasoning about the full shape of the error distribution.
📝 Abstract
Leave-one-out cross-validation (LOO-CV) is a popular method for comparing Bayesian models based on their estimated predictive performance on new, unseen data. Estimating the uncertainty of the resulting LOO-CV estimate is a complex task, and it is known that the commonly used standard error estimate is often too small. We analyse the frequency properties of the LOO-CV estimator and study the uncertainty related to it. We provide new results on the properties of this uncertainty, both theoretically and empirically, and discuss the challenges of estimating it. We show that problematic cases include: comparing models with similar predictions, misspecified models, and small data. In these cases, there is only a weak connection between the skewness of the sampling distribution and that of the error distribution of the LOO-CV estimator. We also show that, in certain situations, the problematic skewness of the error distribution, which occurs when the models make similar predictions, does not fade away even as the data size grows to infinity.
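The commonly used standard error mentioned in the abstract can be sketched in a few lines. The helper below is a hypothetical illustration (its name and inputs are not from the paper): given pointwise LOO log predictive densities for two models, it computes the estimated elpd difference, the naive standard error that the paper argues is often too small, and the sample skewness of the pointwise differences, the quantity whose distributional role the paper analyses.

```python
import numpy as np

def loo_difference_summary(loo_a, loo_b):
    """Summarise a pairwise LOO-CV model comparison.

    loo_a, loo_b: pointwise elpd_loo values (one per observation).
    Returns (elpd difference, naive standard error, sample skewness
    of the pointwise differences). Illustrative sketch only.
    """
    diff = np.asarray(loo_a, dtype=float) - np.asarray(loo_b, dtype=float)
    n = diff.size
    elpd_diff = diff.sum()
    # Naive SE treats the pointwise differences as i.i.d.; the paper
    # shows this commonly underestimates the true uncertainty.
    se = np.sqrt(n * diff.var(ddof=1))
    # Sample skewness of the pointwise differences: strong skewness is
    # a warning sign that a normal approximation may be unreliable.
    centered = diff - diff.mean()
    skew = (centered**3).mean() / diff.std(ddof=0)**3
    return elpd_diff, se, skew
```

In practice these pointwise values would come from a PSIS-LOO computation (e.g. via ArviZ); here they are taken as given to keep the sketch self-contained.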
Problem

Research questions and friction points this paper is trying to address.

Estimating predictive performance uncertainty in Bayesian LOO-CV
Comparing model predictive performances with associated uncertainties
Analyzing normal approximation accuracy for predictive performance differences
Innovation

Methods, ideas, or system contributions that make the work stand out.

Bayesian LOO-CV for predictive performance comparison
Analyzing uncertainty with normal and higher moments
Empirical validation on hierarchical and spline models
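A minimal Monte Carlo sketch of the kind of skewness the paper studies (illustrative settings of my own, not the paper's experiments): when data come from N(0, 1) and two fixed Gaussian predictive densities differ only slightly in variance, the elpd-difference estimate is a shifted and scaled chi-square, so its sampling distribution is visibly skewed at small n.

```python
import numpy as np

def elpd_diff_skewness(n, sigma2_b=1.1, reps=4000, seed=0):
    """Skewness of the sampling distribution of the elpd-difference estimate.

    Data y ~ N(0, 1); model A predicts N(0, 1), model B predicts
    N(0, sigma2_b). Predictive densities are held fixed (no refitting),
    isolating the effect of models that make similar predictions.
    Illustrative sketch, not the paper's experimental setup.
    """
    rng = np.random.default_rng(seed)
    diffs = np.empty(reps)
    for r in range(reps):
        y = rng.standard_normal(n)
        lpd_a = -0.5 * np.log(2 * np.pi) - 0.5 * y**2
        lpd_b = -0.5 * np.log(2 * np.pi * sigma2_b) - y**2 / (2 * sigma2_b)
        diffs[r] = (lpd_a - lpd_b).sum()
    centered = diffs - diffs.mean()
    return (centered**3).mean() / diffs.std()**3
```

At n = 10 the skewness is clearly nonzero, so a normal approximation to the comparison uncertainty is poor; in this well-specified toy setting it shrinks as n grows, whereas the paper identifies situations where it does not.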