🤖 AI Summary
This study addresses the difficulty that traditional Bayesian variable selection methods have in accurately quantifying uncertainty under model misspecification, which compromises selection performance. The authors propose a quasi-posterior-based variable selection approach that requires only the specification of mean and variance functions, eliminating the need for a fully specified likelihood, so it combines robustness with the advantages of Bayesian inference. The paper establishes the model quasi-posterior as a principled tool for variable selection and uses Laplace approximations to compute quasi-marginal likelihoods that are not available in closed form. By avoiding full model specification while preserving desirable Bayesian properties, the method improves selection accuracy under complex data-generating mechanisms, such as heavy-tailed errors and overdispersed count outcomes, and is illustrated on real-world datasets from social science and genomics.
📝 Abstract
Bayesian inference offers a powerful framework for variable selection by incorporating sparsity through prior beliefs and quantifying uncertainty about parameters, leading to consistent procedures with good finite-sample performance. However, accurately quantifying uncertainty requires a correctly specified model, and there is increasing awareness of the problems that model misspecification causes for variable selection. Current solutions to this problem either require a more complex model, detracting from the interpretability of the original variable selection task, or gain robustness by moving outside of rigorous Bayesian uncertainty quantification. This paper establishes the model quasi-posterior as a principled tool for variable selection. We prove that the model quasi-posterior shares many of the desirable properties of full Bayesian variable selection, but no longer necessitates a full likelihood specification. Instead, the quasi-posterior only requires the specification of mean and variance functions, and as a result, is robust to other aspects of the data. When the quasi-marginal likelihood is not available in closed form, Laplace approximations are used to keep computation tractable. We demonstrate through extensive simulation studies that the quasi-posterior improves variable selection accuracy across a range of data-generating scenarios, including linear models with heavy-tailed errors and overdispersed count data. We further illustrate the practical relevance of the proposed approach through applications to real datasets from social science and genomics.
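
To make the idea concrete, the following is a minimal sketch of how a Laplace-approximated quasi-marginal likelihood could be used to compare models for overdispersed counts. It is not the authors' implementation. Everything specific here is an assumption chosen for illustration: a log link with variance function V(mu) = mu, a Pearson-type dispersion estimate taken from the largest model, an independent N(0, tau2) prior on each coefficient, a uniform prior over models, and the hypothetical helper names `fit_mode` and `log_quasi_marginal`.

```python
import itertools
import numpy as np


def fit_mode(X, y, phi, tau2, n_iter=100):
    """Newton search with step halving for the mode of the log quasi-posterior.

    Assumed working model: log link, V(mu) = mu, N(0, tau2) prior per coefficient.
    """
    p = X.shape[1]

    def objective(b):
        eta = X @ b
        # Quasi-log-likelihood (up to a term constant in b) plus log prior kernel.
        return np.sum(y * eta - np.exp(eta)) / phi - 0.5 * b @ b / tau2

    beta = np.zeros(p)
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu) / phi - beta / tau2
        neg_hess = (X.T * mu) @ X / phi + np.eye(p) / tau2
        step = np.linalg.solve(neg_hess, grad)
        t = 1.0
        while objective(beta + t * step) < objective(beta) and t > 1e-8:
            t *= 0.5  # halve the step until the objective does not decrease
        beta = beta + t * step
        if np.max(np.abs(t * step)) < 1e-10:
            break
    mu = np.exp(X @ beta)
    neg_hess = (X.T * mu) @ X / phi + np.eye(p) / tau2
    return beta, neg_hess


def log_quasi_marginal(X, y, phi, tau2):
    """Laplace approximation to the log quasi-marginal likelihood of one model."""
    p = X.shape[1]
    beta, neg_hess = fit_mode(X, y, phi, tau2)
    eta = X @ beta
    log_quasi_lik = np.sum(y * eta - np.exp(eta)) / phi
    log_prior = -0.5 * p * np.log(2 * np.pi * tau2) - 0.5 * beta @ beta / tau2
    _, logdet = np.linalg.slogdet(neg_hess)
    return log_quasi_lik + log_prior + 0.5 * p * np.log(2 * np.pi) - 0.5 * logdet


# Simulated overdispersed counts (gamma-Poisson mixture); only covariate 0 matters.
rng = np.random.default_rng(0)
n, tau2 = 300, 10.0
Z = rng.normal(size=(n, 3))
mu_true = np.exp(0.5 + 0.7 * Z[:, 0])
y = rng.poisson(mu_true * rng.gamma(shape=2.0, scale=0.5, size=n))

# Pearson-type dispersion estimate from the largest model (an assumed choice).
X_full = np.column_stack([np.ones(n), Z])
beta_full, _ = fit_mode(X_full, y, phi=1.0, tau2=tau2)
mu_full = np.exp(X_full @ beta_full)
phi_hat = np.sum((y - mu_full) ** 2 / mu_full) / (n - X_full.shape[1])

# Enumerate all subsets of the three covariates; the intercept is always included.
models = [s for r in range(4) for s in itertools.combinations(range(3), r)]
log_m = np.array([
    log_quasi_marginal(
        np.column_stack([np.ones(n)] + [Z[:, j] for j in s]), y, phi_hat, tau2
    )
    for s in models
])

# Quasi-posterior model probabilities under a uniform prior over models.
probs = np.exp(log_m - log_m.max())
probs /= probs.sum()
best = models[int(np.argmax(probs))]
print("dispersion estimate:", round(float(phi_hat), 2))
print("highest-probability model (covariate indices):", best)
```

Because the dispersion is held fixed across models, the term of the quasi-log-likelihood that does not depend on the coefficients cancels when the model probabilities are normalised, which is why it is omitted above. Exhaustive enumeration is only feasible for a handful of covariates; a larger problem would need a search or sampling scheme over models.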