🤖 AI Summary
This paper addresses linear time-series regression models with autocorrelated errors and lagged covariates, proposing the first Bayesian framework for *joint selection* of important covariates and the autoregressive order of the error process. Methodologically, the authors develop a hierarchical spike-and-slab prior that enables simultaneous Bayesian variable selection over both regressors and error lags, and design a two-stage MCMC algorithm that improves computational efficiency and selection accuracy in high-dimensional settings. Posterior selection consistency is established in high dimensions under mild regularity conditions. Empirical evaluations on groundwater-depth forecasting and S&P 500 log-return modeling show substantial reductions in mean squared prediction error (MSPE), more reliable identification of the true model, and improved predictive robustness. The method is particularly well suited to domains with strong temporal dependence, such as finance, hydrology, and meteorology.
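The model class described above combines sparse regression coefficients, drawn under a spike-and-slab prior, with an autoregressive error process. A minimal simulation sketch of that structure is below; all dimensions and hyperparameters (`n`, `p`, `q`, `pi_incl`, `tau`, `phi`) are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumed, not from the paper):
# n observations, p candidate covariates, AR error order q.
n, p, q = 300, 20, 2

# Spike-and-slab draw: gamma_j ~ Bernoulli(pi_incl) is the inclusion
# indicator; beta_j = 0 under the spike, beta_j ~ N(0, tau^2) under the slab.
pi_incl, tau = 0.2, 1.0
gamma = rng.binomial(1, pi_incl, size=p)
beta = gamma * rng.normal(0.0, tau, size=p)

# Design matrix of (possibly lagged) covariates.
X = rng.normal(size=(n, p))

# AR(q) error process: e_t = sum_k phi_k * e_{t-k} + eps_t,
# with assumed stationary AR(2) coefficients.
phi = np.array([0.5, -0.2])
eps = rng.normal(0.0, 0.5, size=n)
e = np.zeros(n)
for t in range(n):
    for k in range(1, q + 1):
        if t - k >= 0:
            e[t] += phi[k - 1] * e[t - k]
    e[t] += eps[t]

y = X @ beta + e
```

In the paper's framework, both `gamma` (which covariates enter the mean) and the order of the error process are treated as unknown and sampled jointly; this sketch only shows the generative side of that model.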
📝 Abstract
We develop a Bayesian framework for variable selection in linear regression with autocorrelated errors, accommodating lagged covariates and autoregressive structures. This setting arises in time-series applications where responses depend on contemporaneous or past explanatory variables and on persistent stochastic shocks, including financial modeling, hydrological forecasting, and meteorological applications where temporal dependence must be captured. Our methodology uses hierarchical Bayesian models with spike-and-slab priors to simultaneously select relevant covariates and lagged error terms. We propose an efficient two-stage MCMC algorithm that separates the sampling of variable-inclusion indicators from that of model parameters, addressing high-dimensional computational challenges. Theoretical analysis establishes posterior selection consistency under mild conditions, even when the number of candidate predictors grows exponentially with the sample size, a regime common in modern time series with many potential lagged variables. Through simulations and real-data applications (groundwater depth prediction and S&P 500 log-return modeling), we demonstrate substantial gains in variable selection accuracy and predictive performance. Compared to existing methods, our framework achieves lower MSPE, improved identification of the true model components, and greater robustness to autocorrelated noise, underscoring its practical utility for model interpretation and forecasting in autoregressive settings.
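Predictive performance in the abstract is compared via mean squared prediction error (MSPE). A minimal sketch of that metric on a hold-out window, using toy numbers rather than the paper's results:

```python
import numpy as np

def mspe(y_true, y_pred):
    """Mean squared prediction error over a hold-out window."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Toy hold-out comparison of two hypothetical forecasters (illustrative data).
y_test = np.array([1.0, 2.0, 3.0])
pred_a = np.array([1.1, 1.9, 3.2])   # closer forecasts
pred_b = np.array([1.5, 2.5, 2.0])   # worse forecasts

# Lower MSPE indicates better out-of-sample prediction.
print(mspe(y_test, pred_a), mspe(y_test, pred_b))
```

This is only the evaluation criterion; the gains reported in the abstract come from the joint selection of covariates and error lags, not from the metric itself.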