🤖 AI Summary
Existing shrinkage priors for high-dimensional linear regression with naturally grouped covariates struggle to achieve adaptive sparsity at the group level and within groups at the same time. To address this, we propose the Group R2D2 shrinkage prior, a novel extension of the R²-induced Dirichlet Decomposition (R2D2) framework to grouped sparse settings. By placing a Dirichlet prior on the explained variance (R²) attributable to each group, Group R2D2 balances within-group dependence against group-level sparsity. Built on a Bayesian hierarchical model and the R²-induced Dirichlet decomposition, it is supported by theoretical properties of the prior and a practical MCMC algorithm. Simulations and real-data analyses show that Group R2D2 outperforms mainstream methods, including Lasso and Horseshoe, in estimation accuracy, variable selection consistency, posterior inference reliability, and predictive performance.
📝 Abstract
Shrinkage priors are a popular Bayesian paradigm for handling sparsity in high-dimensional regression. Flexible classes of shrinkage priors for grouped sparsity, where covariates exhibit a natural grouping structure, remain limited, however. This paper proposes a novel extension of the $R^2$-induced Dirichlet Decomposition (R2D2) prior to accommodate grouped variable selection in linear regression models. The proposed method, called the Group R2D2 prior, places a Dirichlet prior distribution on the coefficient of determination for each group, allowing flexible and adaptive shrinkage that operates at both the group level and the individual-variable level. This approach extends the original R2D2 prior to handle grouped predictors, balancing within-group dependence against group-level sparsity. We present several theoretical properties of the proposed prior and develop a Markov chain Monte Carlo algorithm for posterior inference. Through simulation studies and real-data analysis, we demonstrate that our method outperforms traditional shrinkage priors in terms of estimation accuracy, inference, and prediction.
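The idea of splitting $R^2$ across groups and then across variables can be illustrated by drawing coefficients from a two-level Dirichlet hierarchy. The sketch below is an illustration only, not the paper's exact specification: the function name `sample_group_r2d2_sketch`, the hyperparameters `a`, `b`, `group_conc`, and `within_conc`, and the nested-Dirichlet structure are assumptions, and the normal scale-mixture form with exponential mixing is borrowed from the way the original R2D2 prior is commonly written.

```python
import numpy as np

def sample_group_r2d2_sketch(group_sizes, a=1.0, b=1.0, group_conc=0.5,
                             within_conc=0.5, sigma=1.0, rng=None):
    """Draw one coefficient vector from an illustrative grouped R2-style prior.

    All hyperparameter names and the nested-Dirichlet structure are assumptions
    made for this sketch; they are not taken from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    group_sizes = np.asarray(group_sizes, dtype=int)
    n_groups = len(group_sizes)

    # Prior on the coefficient of determination, mapped to a total
    # signal-to-noise variance omega = R^2 / (1 - R^2).
    r2 = rng.beta(a, b)
    omega = r2 / (1.0 - r2)

    # Group level: a Dirichlet splits the total variance across groups.
    group_share = rng.dirichlet(np.full(n_groups, group_conc))

    # Within-group level: a second Dirichlet splits each group's share
    # across the variables in that group.
    shares = np.concatenate([
        group_share[g] * rng.dirichlet(np.full(p_g, within_conc))
        for g, p_g in enumerate(group_sizes)
    ])

    # Exponential mixing (rate 1/2, i.e. mean 2) turns the conditional
    # normal into a Laplace-type marginal, as in the usual R2D2 mixture form.
    psi = rng.exponential(scale=2.0, size=shares.size)
    beta = rng.normal(0.0, sigma * np.sqrt(psi * shares * omega / 2.0))
    return beta, shares, r2


beta, shares, r2 = sample_group_r2d2_sketch([3, 2, 5], rng=np.random.default_rng(0))
print(beta.round(3), shares.sum(), r2)
```

Small concentration values (below 1) push most of the variance onto a few groups and a few variables within them, which is how a hierarchy of this kind can shrink at both levels; larger values spread the variance more evenly.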