🤖 AI Summary
This work addresses the inefficiency of the standard No-U-Turn Sampler (NUTS) in high-dimensional Bayesian hierarchical models, where strong posterior correlations among parameters severely hinder sampling performance. The authors propose Sparse NUTS (SNUTS), which introduces a sparse precision matrix as a preconditioner—an approach not previously employed in NUTS. This matrix is efficiently estimated via Laplace approximation using the Template Model Builder (TMB) and seamlessly integrated into Stan to enable rapid sampling. By leveraging sparsity, SNUTS overcomes the limitations of conventional diagonal or dense preconditioners. Empirical evaluation across 17 benchmark problems demonstrates speedups of 10–100× over Stan's default configuration, outperforming preconditioning with covariance estimates from Pathfinder variational inference. The method scales effectively to parameter spaces exceeding 10⁴ dimensions, substantially enhancing the efficiency of Bayesian inference for high-dimensional, sparse, and strongly correlated models, though it does not help in posteriors with highly varying curvature (e.g., funnels, wide tails, or multiple modes).
📝 Abstract
Analysts routinely use Bayesian hierarchical models to understand natural processes. The no-U-turn sampler (NUTS) is the most widely used algorithm to sample high-dimensional, continuously differentiable models. But NUTS is slowed by high correlations, especially in high dimensions, limiting the complexity of applied analyses. Here we introduce Sparse NUTS (SNUTS), which preconditions (decorrelates and descales) posteriors using a sparse precision matrix ($Q$). We use Template Model Builder (TMB) to efficiently compute $Q$ from the mode of the Laplace approximation to the marginal posterior, then pass the preconditioned posterior to NUTS through the Bayesian software Stan for sampling. We apply SNUTS to seventeen diverse case studies to demonstrate that preconditioning with $Q$ converges one to two orders of magnitude faster than with Stan's industry-standard diagonal or dense preconditioners. SNUTS also outperforms preconditioning with the inverse of the covariance estimated with Pathfinder variational inference. SNUTS does not improve sampling efficiency for models with the highly varying curvature found in funnels, wide tails, or multiple modes. SNUTS is most advantageous, and can be scaled beyond $10^4$ parameters, in the presence of high dimensionality, sparseness, and high correlations, all of which are widespread in applied statistics. An open-source implementation of SNUTS is provided in the R package SparseNUTS.
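The core idea of precision-matrix preconditioning can be illustrated with a minimal sketch. This is not the authors' implementation (SparseNUTS is an R package built on TMB and Stan); it is a hypothetical NumPy example, using an assumed AR(1)-style tridiagonal precision matrix $Q$, showing how the Cholesky factor $Q = LL^\top$ transforms a correlated posterior into an approximately uncorrelated, unit-scale target for the sampler.

```python
import numpy as np

# Hypothetical example: a sparse (tridiagonal) AR(1) precision matrix Q
# for d strongly correlated parameters. In SNUTS, Q would instead come
# from the Laplace approximation at the posterior mode via TMB.
d, rho = 5, 0.9
Q = np.zeros((d, d))
for i in range(d):
    Q[i, i] = 1 + rho**2 if 0 < i < d - 1 else 1.0
for i in range(d - 1):
    Q[i, i + 1] = Q[i + 1, i] = -rho
Q /= 1 - rho**2  # scale so the implied marginal variances are 1

# Cholesky factor Q = L L^T. Sampling in z-space with x = mode + L^{-T} z
# gives z an (approximately) identity covariance, which NUTS handles well.
L = np.linalg.cholesky(Q)
mode = np.zeros(d)  # placeholder; in practice, the Laplace-approximation mode

def to_model_space(Z):
    # Map unit-scale draws z back to model space: solve L^T x = z per row.
    return mode + np.linalg.solve(L.T, Z.T).T

# Check: if z ~ N(0, I), then x ~ N(mode, Q^{-1}).
rng = np.random.default_rng(0)
Z = rng.standard_normal((100_000, d))
X = to_model_space(Z)
emp_cov = np.cov(X, rowvar=False)
print(np.allclose(emp_cov, np.linalg.inv(Q), atol=0.05))
```

Because $Q$ is sparse, real implementations compute $L$ with a sparse Cholesky factorization, which is what makes the approach scale beyond $10^4$ parameters; the dense `np.linalg.cholesky` here is only for illustration at small $d$.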