Occam's Razor is Only as Sharp as Your ELBO

📅 2026-04-28

🤖 AI Summary

This study investigates how the rank assumed for the covariance matrix of a Gaussian approximate posterior in variational inference affects Bayesian model selection driven by the evidence lower bound (ELBO). In a simple over-parameterized regression model, ELBO-based hyperparameter learning can overfit, depending on the assumed rank, and model selection via the ELBO can disagree with model selection via the true marginal likelihood. The ELBO may therefore fail as a reliable proxy for the marginal likelihood under certain covariance rank constraints, which challenges its commonly assumed role as an embodiment of Occam's razor. These findings caution against choosing a posterior covariance structure purely for tractability in large-scale Bayesian modeling, since the rank assumption can change which model is selected.

📝 Abstract

The marginal likelihood, also known as the evidence, is regarded as a mathematical embodiment of Occam's razor, enabling model selection that avoids overfitting. The evidence lower bound (ELBO) objective from variational inference has also been used for similar purposes. Prior work has shown that restricting the approximate posterior family via a mean-field approximation can lead the ELBO to underfit. In this paper, we show how ELBO-based hyperparameter learning in a simple over-parameterized regression model can also produce overfitting, depending on the assumed rank of the covariance matrix in a Gaussian approximate posterior. Surprisingly, among only the underfit and overfit options, Bayesian model selection via the evidence itself sometimes prefers the overfit version, while the ELBO does not. Bayesian practitioners hoping to scale to large models should be cautious about how reduced-rank assumptions needed for tractability may impact the potential for model selection.
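
To make the relationship between the evidence and the ELBO concrete, here is a minimal sketch, not the paper's model or code. It assumes conjugate Bayesian linear regression with prior precision `alpha` and noise variance `sigma2` (both names, and the data sizes, are illustrative choices not taken from the paper). In that setting the evidence has a closed form, an unrestricted Gaussian posterior makes the ELBO match it exactly, and a diagonal (mean-field) covariance leaves a strictly positive gap. The reduced-rank families studied in the paper are not reproduced here.

```python
import numpy as np

def log_evidence(X, y, alpha, sigma2):
    """Exact log marginal likelihood: y ~ N(0, sigma2*I + X X^T / alpha)."""
    n = X.shape[0]
    C = sigma2 * np.eye(n) + (X @ X.T) / alpha
    _, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))

def elbo(X, y, alpha, sigma2, m, S):
    """ELBO for q(w) = N(m, S) with prior w ~ N(0, I/alpha)."""
    n, d = X.shape
    resid = y - X @ m
    # Expected log-likelihood under q
    exp_loglik = (-0.5 * n * np.log(2 * np.pi * sigma2)
                  - 0.5 / sigma2 * (resid @ resid + np.trace(X @ S @ X.T)))
    # Expected log-prior under q
    exp_logprior = (0.5 * d * np.log(alpha / (2 * np.pi))
                    - 0.5 * alpha * (m @ m + np.trace(S)))
    # Entropy of the Gaussian q
    _, logdet_S = np.linalg.slogdet(S)
    entropy = 0.5 * d * (1 + np.log(2 * np.pi)) + 0.5 * logdet_S
    return exp_loglik + exp_logprior + entropy

# Over-parameterized toy problem: more weights (d) than observations (n).
rng = np.random.default_rng(0)
n, d = 5, 20
X = rng.normal(size=(n, d))
y = rng.normal(size=n)
alpha, sigma2 = 1.0, 0.1  # hypothetical hyperparameter values

# Exact posterior: precision = alpha*I + X^T X / sigma2
precision = alpha * np.eye(d) + X.T @ X / sigma2
S_full = np.linalg.inv(precision)
m = S_full @ X.T @ y / sigma2

# Best mean-field Gaussian: same mean, variances 1 / diag(precision)
S_diag = np.diag(1.0 / np.diag(precision))

log_ev = log_evidence(X, y, alpha, sigma2)
elbo_full = elbo(X, y, alpha, sigma2, m, S_full)
elbo_mf = elbo(X, y, alpha, sigma2, m, S_diag)

print("log evidence     :", log_ev)
print("ELBO, full cov   :", elbo_full)
print("ELBO, mean-field :", elbo_mf)
```

Because the gap between the evidence and the ELBO depends on the hyperparameters as well as on the restricted family, maximizing the ELBO over `alpha` or `sigma2` need not select the same values as maximizing the evidence, which is the kind of mismatch the abstract describes.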

Problem

Research questions and friction points this paper is trying to address.

ELBO

overfitting

Bayesian model selection

evidence

covariance rank

Innovation

Methods, ideas, or system contributions that make the work stand out.

evidence lower bound

marginal likelihood

overfitting