🤖 AI Summary
To address the computational cost of posterior inference in steady-state Bayesian vector autoregressive (BVAR) models, this paper introduces variational inference (VI) into the steady-state BVAR framework, proposing a VI algorithm that jointly and efficiently approximates the posterior distribution of the model parameters, the one-step-ahead predictive distribution, and the log predictive score used for model comparison. In both simulation studies and empirical applications to U.S. macroeconomic data, the method achieves prediction accuracy very close to that of Gibbs sampling while reducing computation time by orders of magnitude, with especially large gains for log predictive scores. Its scalability advantage grows with the number of time series in the system, making it well suited to the increasingly popular large-scale BVARs. The appeal of the steady-state parameterization is that it lets prior information about the long-run mean of the process be incorporated directly, a feature that has repeatedly been shown to improve forecasting performance.
📝 Abstract
The steady-state Bayesian vector autoregression (BVAR) makes it possible to incorporate prior information about the long-run mean of the process. This has been shown in many studies to substantially improve forecasting performance, and the model is routinely used for forecasting and macroeconomic policy analysis at central banks and other financial institutions. Steady-state BVARs are estimated using Gibbs sampling, which is time-consuming for the increasingly popular large-scale BVAR models with many variables. We propose a fast variational inference (VI) algorithm for approximating the parameter posterior and predictive distribution of the steady-state BVAR, as well as log predictive scores for model comparison. We use simulated and real US macroeconomic data to show that VI produces results that are very close to those from Gibbs sampling. The computing time of VI can be orders of magnitude lower than that of Gibbs sampling, in particular for log predictive scores, and VI is shown to scale much better with the number of time series in the system.
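As a sketch of the parameterization behind the abstract's "prior information about the long-run mean": the steady-state BVAR is commonly written in the mean-adjusted form of Villani (2009). The symbols below are generic notation for illustration, not taken from this paper:

```latex
% Mean-adjusted (steady-state) VAR(p):
%   \Pi(L) is the lag polynomial, \mu the steady-state mean.
\Pi(L)\,(y_t - \mu) = \varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, \Sigma),
\qquad \Pi(L) = I - \Pi_1 L - \dots - \Pi_p L^p .
```

When $\Pi(L)$ is stable, $\mu$ is the unconditional mean of $y_t$, so a prior on $\mu$ is a prior directly on the long-run mean. In the standard intercept form $y_t = c + \Pi_1 y_{t-1} + \dots + \Pi_p y_{t-p} + \varepsilon_t$, by contrast, the implied mean $\mu = \Pi(1)^{-1} c$ is a nonlinear function of all the dynamic coefficients, which makes long-run prior information awkward to express; this nonlinearity is also why the steady-state model lacks a conjugate posterior and is typically estimated by Gibbs sampling.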