🤖 AI Summary
This paper addresses regression with [0,1]-valued targets and introduces the betting loss, the first loss function achieving a variance-dependent second-order generalization bound. Unlike existing approaches, the betting loss requires no prior knowledge of the label variance and no explicit modeling of the label distribution, yet it adapts automatically to the variance of the optimal predictor under i.i.d. assumptions. Theoretically, its second-order bound is strictly tighter than classical first-order bounds; in particular, it yields faster convergence rates in low-variance regimes. This work establishes the first fully data-dependent, variance-agnostic second-order adaptive generalization analysis for bounded regression, overcoming a fundamental limitation of standard losses (e.g., the squared loss), which yield only variance-independent bounds.
📝 Abstract
We consider the $[0,1]$-valued regression problem in the i.i.d. setting. In a related problem called cost-sensitive classification, \citet{foster21efficient} have shown that the log loss minimizer achieves an improved generalization bound compared to that of the squared loss minimizer, in the sense that the bound scales with the cost of the best classifier, which can be arbitrarily small depending on the problem at hand. Such a result is often called a first-order bound. For $[0,1]$-valued regression, we first show that the log loss minimizer leads to a similar first-order bound. We then ask whether there exists a loss function that achieves a variance-dependent bound (also known as a second-order bound), which is a strict improvement upon first-order bounds. We answer this question in the affirmative by proposing a novel loss function called the betting loss. Our result is ``variance-adaptive'' in the sense that the bound is attained \textit{without any knowledge about the variance}, which is in contrast to modeling the label (or reward) variance or the label distribution itself explicitly as part of the function class, as in distributional reinforcement learning.
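For concreteness, the two baseline losses contrasted in the abstract can be written down directly for $[0,1]$-valued targets. The sketch below is illustrative only: the betting loss itself is defined in the paper and is not reproduced here, and the function names are placeholders, not the authors' code.

```python
import numpy as np

def squared_loss(y, p):
    # Standard squared loss for [0,1]-valued regression;
    # its population minimizer is the conditional mean E[y | x].
    return (y - p) ** 2

def log_loss(y, p, eps=1e-12):
    # Binary cross-entropy extended to soft labels y in [0,1].
    # Its population minimizer is also E[y | x], but (per the
    # abstract) its minimizer enjoys a first-order bound that
    # the squared-loss minimizer does not.
    p = np.clip(p, eps, 1.0 - eps)
    return -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

y = np.array([0.0, 0.25, 1.0])   # [0,1]-valued targets
p = np.array([0.1, 0.3, 0.9])    # predictions
print(squared_loss(y, p))
print(log_loss(y, p))
```

Both losses are minimized in expectation by the same predictor; the paper's point is that which loss you *train* with changes the generalization bound you can prove for the resulting minimizer.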