🤖 AI Summary
For large-scale stochastic computer simulations with input-dependent noise, existing heteroskedastic Gaussian process (hetGP) methods suffer from high computational complexity and reliance on point estimation, limiting both scalability and uncertainty quantification. This paper proposes a Bayesian hetGP model that, for the first time, integrates elliptical slice sampling (ESS) with the Vecchia approximation: ESS enables full posterior inference for the latent variance function, while Vecchia's sparse conditioning reduces computational complexity to linear in the sample size, overcoming scalability bottlenecks. Crucially, uncertainty is quantified by integrating over the posterior of the latent variances rather than plugging in point estimates, substantially improving probabilistic calibration. The method is validated on a real-world lake temperature simulation corpus of more than 9 million runs, as well as several benchmark tasks, demonstrating superior efficiency, accuracy, and scalability. An open-source R package, bhetGP, implementing the methodology is available on CRAN.
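The ESS step mentioned above is a standard, tuning-free MCMC move for latent vectors with a Gaussian (here, GP) prior: it draws an auxiliary prior sample and slices along the ellipse connecting it to the current state, so every proposal is accepted eventually. Below is a minimal, self-contained Python sketch of one ESS update applied to latent log-variances of heteroskedastic data; it is an illustration of the algorithm, not the bhetGP implementation, and all names (`elliptical_slice`, the toy `log_lik`) are invented for this example.

```python
import numpy as np

def elliptical_slice(f, prior_chol, log_lik, rng):
    """One elliptical slice sampling update (Murray, Adams & MacKay, 2010).

    f          : current latent vector under a zero-mean GP prior
    prior_chol : Cholesky factor L of the prior covariance (f ~ N(0, L L^T))
    log_lik    : function giving the log-likelihood of a latent vector
    """
    n = f.shape[0]
    nu = prior_chol @ rng.standard_normal(n)       # auxiliary draw from the prior
    log_y = log_lik(f) + np.log(rng.uniform())     # slice threshold below current ll
    theta = rng.uniform(0.0, 2 * np.pi)            # initial angle on the ellipse
    lo, hi = theta - 2 * np.pi, theta              # shrinking bracket
    while True:
        f_prop = f * np.cos(theta) + nu * np.sin(theta)
        if log_lik(f_prop) > log_y:
            return f_prop                          # accept: proposal is on the slice
        # shrink the bracket toward the current state and retry
        if theta < 0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)

# Toy demo: sample latent log-variances for heteroskedastic, zero-mean data.
rng = np.random.default_rng(0)
n = 50
x = np.linspace(0, 1, n)
K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / 0.1 ** 2) + 1e-6 * np.eye(n)
L = np.linalg.cholesky(K)                          # GP prior on log-variances
y = rng.standard_normal(n) * np.exp(0.5 * np.sin(2 * np.pi * x))

def log_lik(log_var):
    # Gaussian likelihood of zero-mean data with latent per-point log-variances
    return -0.5 * np.sum(log_var + y ** 2 / np.exp(log_var))

f = np.zeros(n)
for _ in range(200):
    f = elliptical_slice(f, L, log_lik, rng)       # short MCMC chain
```

Because the slice threshold sits strictly below the current log-likelihood, the bracket-shrinking loop is guaranteed to terminate, which is why ESS needs no step-size tuning.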
📝 Abstract
Many computer simulations are stochastic and exhibit input-dependent noise. In such situations, heteroskedastic Gaussian processes (hetGPs) make ideal surrogates as they estimate a latent, non-constant variance. However, existing hetGP implementations are unable to deal with large simulation campaigns and use point estimates for all unknown quantities, including latent variances. This limits applicability to small experiments and undercuts uncertainty quantification. We propose a Bayesian hetGP using elliptical slice sampling (ESS) for posterior variance integration, and the Vecchia approximation to circumvent computational bottlenecks. We show good performance for our upgraded hetGP capability, compared to alternatives, on a benchmark example and a motivating corpus of more than 9 million lake temperature simulations. An open-source implementation is provided as bhetGP on CRAN.
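The Vecchia approximation named in the abstract replaces the full multivariate-normal density with a product of conditionals in which each point conditions only on a small set of previously ordered neighbors, which is what drives the cost from cubic to linear in the sample size. The following is a rough, dense-matrix Python sketch of that idea, assuming 1D inputs sorted so that index order matches spatial order (so the nearest predecessors are simply the previous indices); a real implementation exploits sparsity and never forms the full covariance matrix. The function name and setup are illustrative, not bhetGP's API.

```python
import numpy as np

def vecchia_loglik(y, K, m):
    """Vecchia approximation to a zero-mean multivariate-normal log density.

    Each y[i] conditions only on its m nearest predecessors in the ordering,
    so the work per point involves an m x m solve instead of an n x n one.
    K is passed dense here purely for illustration.
    """
    n = len(y)
    ll = 0.0
    for i in range(n):
        c = list(range(max(0, i - m), i))       # conditioning set: m predecessors
        if c:
            Kcc = K[np.ix_(c, c)]
            kic = K[i, c]
            w = np.linalg.solve(Kcc, kic)       # kriging weights
            mu = w @ y[c]                       # conditional mean
            var = K[i, i] - w @ kic             # conditional variance
        else:
            mu, var = 0.0, K[i, i]              # first point: marginal density
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mu) ** 2 / var)
    return ll

# Toy check: exponential (Markovian in 1D) kernel on sorted inputs.
n = 40
x = np.linspace(0, 1, n)
K = np.exp(-np.abs(x[:, None] - x[None, :]) / 0.2) + 1e-8 * np.eye(n)
Lch = np.linalg.cholesky(K)
rng = np.random.default_rng(1)
y = Lch @ rng.standard_normal(n)

# Exact log density for comparison, via the Cholesky factor.
exact = -0.5 * (n * np.log(2 * np.pi)
                + 2 * np.sum(np.log(np.diag(Lch)))
                + y @ np.linalg.solve(K, y))
```

With `m = n - 1` every point conditions on all of its predecessors, so the chain rule makes the approximation exact; small `m` trades a little accuracy for the near-linear scaling the paper relies on.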