🤖 AI Summary
In online learning, accurate estimation of time-varying losses is fundamentally hindered by the bias–variance trade-off; existing prequential estimators rely on strong assumptions and are highly sensitive to hyperparameters. This paper proposes a recursive, unbiased, variance-reduced estimator grounded in algorithmic stability, marking the first application of stability analysis to online unbiased loss estimation. The method employs stability-driven dynamic weighting and adaptive hyperparameter tuning, achieving theoretically consistent, real-time updates with constant time and memory overhead, without requiring prior knowledge or ground-truth information for calibration. Empirically, it significantly outperforms mainstream baselines across diverse online convex optimization and stochastic learning tasks, and it remains competitive even against strong baselines tuned using true gradients. These results demonstrate the estimator's effectiveness, robustness, and practical utility.
📝 Abstract
Online learning algorithms continually update their models as data arrive, making it essential to accurately estimate the expected loss at the current time step. The prequential method is an effective estimation approach that can be deployed in practice in various ways. However, its theoretical guarantees have previously been established only under strong conditions on the learning algorithm, and practical variants have hyperparameters that require careful tuning. We introduce OEUVRE, an estimator that evaluates each incoming sample on the functions learned at the current and previous time steps, recursively updating the loss estimate in constant time and memory. We use algorithmic stability, a property satisfied by many popular online learners, to derive optimal updates, and we prove consistency, convergence rates, and concentration bounds for our estimator. We also design a method to adaptively tune OEUVRE's hyperparameters and test it across diverse online and stochastic tasks. We observe that OEUVRE matches or outperforms other estimators even when their hyperparameters are tuned with oracle access to ground truth.
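To make the recursive, constant-time update concrete, here is a minimal sketch of a prequential-style estimator in the spirit described above. This is an illustration under stated assumptions, not OEUVRE's actual update rule: the step size `eta = 1/t` and the equal weighting of the current and previous models' losses are placeholders for the paper's stability-driven weights and adaptive tuning, which are not specified in the abstract.

```python
class RecursiveLossEstimator:
    """Hypothetical sketch of a recursive prequential loss estimator.

    Assumptions (not from the paper): the running estimate is updated with
    step size eta_t = 1/t, and the losses of the current and previous
    models on the incoming sample are averaged with equal weights. OEUVRE's
    stability-driven dynamic weighting would replace both choices.
    """

    def __init__(self):
        self.estimate = 0.0  # running loss estimate, O(1) memory
        self.t = 0           # number of samples seen so far

    def update(self, loss_curr, loss_prev):
        """Fold one incoming sample's losses into the estimate in O(1) time.

        loss_curr: loss of the model learned at the current time step
        loss_prev: loss of the model learned at the previous time step
        """
        self.t += 1
        eta = 1.0 / self.t                      # assumed step size
        sample = 0.5 * (loss_curr + loss_prev)  # assumed equal weighting
        self.estimate += eta * (sample - self.estimate)
        return self.estimate
```

With `eta = 1/t` this reduces to a running mean of the per-sample loss averages; the point of a stability-based analysis is precisely to choose better, data-dependent weights than this naive default.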