A Generalized Bias-Variance Decomposition for Bregman Divergences

📅 2025-11-11
📈 Citations: 21
Influential: 5
🤖 AI Summary
The classical bias–variance decomposition is restricted to squared error loss, limiting its applicability in statistical learning, where non-quadratic losses (e.g., log loss, exponential loss) are common. Method: Leveraging convex analysis and statistical decision theory, the note generalizes the decomposition to prediction errors given by arbitrary Bregman divergences, a setting relevant to maximum likelihood estimation with exponential family distributions, and states precise conditions for its validity. Contribution/Results: Although the result was previously known, the literature lacked a clear, standalone derivation; this note supplies one, unifying fragmented results for specific losses. It enhances the interpretability and pedagogical utility of the bias–variance trade-off and provides a principled, general-purpose tool for model diagnostics and generalization analysis beyond the squared-error setting.

📝 Abstract
The bias-variance decomposition is a central result in statistics and machine learning, but is typically presented only for the squared error. We present a generalization of the bias-variance decomposition where the prediction error is a Bregman divergence, which is relevant to maximum likelihood estimation with exponential families. While the result is already known, there was not previously a clear, standalone derivation, so we provide one for pedagogical purposes. A version of this note previously appeared on the author's personal website without context. Here we provide additional discussion and references to the relevant prior literature.
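For reference, the generalized decomposition discussed in the abstract is commonly stated in the following form (a sketch in our own notation, not transcribed from the paper; here $F$ is a strictly convex, differentiable generator of the Bregman divergence $D_F$, $y$ is the target, and $\hat{y}$ is the estimator, assumed independent of $y$):

```latex
% Generalized bias-variance decomposition for a Bregman divergence D_F.
\mathbb{E}_{y,\hat{y}}\!\left[ D_F(y, \hat{y}) \right]
  = \underbrace{\mathbb{E}_{y}\!\left[ D_F(y, \bar{y}) \right]}_{\text{noise}}
  + \underbrace{D_F(\bar{y}, \tilde{y})}_{\text{bias}}
  + \underbrace{\mathbb{E}_{\hat{y}}\!\left[ D_F(\tilde{y}, \hat{y}) \right]}_{\text{variance}},
\qquad
\bar{y} = \mathbb{E}[y],
\quad
\tilde{y} = (\nabla F)^{-1}\!\left( \mathbb{E}\!\left[ \nabla F(\hat{y}) \right] \right).
```

The only twist relative to the classical result is that the variance is measured around the "dual mean" $\tilde{y}$ rather than the ordinary mean of $\hat{y}$. For $F(x) = \lVert x \rVert^2$ the two coincide and the formula reduces to the familiar squared-error decomposition.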
Problem

Research questions and friction points this paper is trying to address.

Classical bias–variance decomposition applies only to squared error loss
Common losses arising from exponential family likelihoods (e.g., log loss) lacked a unified decomposition
No clear, standalone derivation of the generalized result was previously available
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generalizes the bias–variance decomposition to arbitrary Bregman divergences
Applies to maximum likelihood estimation with exponential families
Provides a clear, standalone derivation for pedagogical purposes
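The decomposition is easy to check numerically. The sketch below is a hypothetical example (not code from the paper): it verifies the three-term identity for the generalized KL divergence on positive scalars, whose generator F(x) = x log x makes the dual mean the geometric mean of the predictions. With independent samples of targets and predictions, the identity holds exactly for the empirical distributions, up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(0)

def bregman_kl(a, b):
    # Bregman divergence generated by F(x) = x*log(x) on positives:
    # D_F(a, b) = a*log(a/b) - a + b  (the generalized KL divergence).
    return a * np.log(a / b) - a + b

y = rng.gamma(5.0, 0.2, size=1000)      # targets (arbitrary positive data)
yhat = rng.gamma(4.0, 0.3, size=1000)   # predictions, independent of y

y_bar = y.mean()                         # ordinary mean of the targets
# Dual mean: (grad F)^{-1}(E[grad F(yhat)]); for F = x*log(x) this is
# exp(E[log yhat]), i.e. the geometric mean of the predictions.
y_tilde = np.exp(np.log(yhat).mean())

# Expected loss over all (target, prediction) pairs, and the three terms.
lhs = bregman_kl(y[:, None], yhat[None, :]).mean()
noise = bregman_kl(y, y_bar).mean()       # E[D_F(y, y_bar)]
bias = bregman_kl(y_bar, y_tilde)         # D_F(y_bar, y_tilde)
variance = bregman_kl(y_tilde, yhat).mean()  # E[D_F(y_tilde, yhat)]

assert np.isclose(lhs, noise + bias + variance)
```

Note that substituting the ordinary mean of `yhat` for `y_tilde` breaks the identity for non-quadratic generators, which is exactly the point of the dual-mean construction.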