Gradient Equilibrium in Online Learning: Theory and Applications

📅 2025-01-14
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses prediction bias, miscalibrated quantiles, and biased Elo ratings induced by distribution shift in online learning. It proposes gradient equilibrium, a condition requiring the average of the loss gradients along the iterate sequence to converge to zero. This condition neither implies nor is implied by sublinear regret, yet it is practical and interpretable. The paper establishes a theoretical framework showing that gradient equilibrium is achieved by standard methods such as online gradient descent and mirror descent with constant step sizes (rather than the decaying step sizes typically needed for no-regret guarantees). The framework unifies guarantees across three tasks: black-box prediction debiasing, quantile calibration, and unbiased Elo scoring for pairwise preference prediction. Experiments on regression, classification, and quantile estimation under distribution shift demonstrate improved bias and calibration, and the approach yields a lightweight, retraining-free post hoc correction scheme.
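As a concrete illustration (a minimal sketch, not code from the paper), the debiasing idea can be written as post hoc online gradient descent on the squared loss with a constant step size. For the loss l_t(θ) = ½(f_t + θ − y_t)², the gradient at θ_t is the residual of the corrected prediction, so the average gradient tending to zero means the corrected predictions are unbiased on average, even when the black-box predictor's bias shifts mid-stream. The `debias_online` helper and the synthetic shifted stream below are illustrative assumptions.

```python
import random

def debias_online(preds, ys, eta=0.1):
    """Additive post hoc debiasing via online gradient descent.

    For the squared loss l_t(theta) = 0.5 * (f_t + theta - y_t)**2,
    the gradient at theta_t is the residual r_t = f_t + theta_t - y_t.
    A constant step size (no decay) suffices for the average of these
    gradients -- i.e. the average residual -- to shrink toward zero.
    """
    theta, grads = 0.0, []
    for f, y in zip(preds, ys):
        g = f + theta - y        # gradient = residual of corrected prediction
        grads.append(g)
        theta -= eta * g         # constant-step online gradient descent
    return theta, sum(grads) / len(grads)

# Synthetic stream with a distribution shift: the black-box predictor is
# roughly unbiased for the first half, then acquires a bias of +3.
random.seed(0)
T = 4000
ys = [random.gauss(0.0, 0.5) for _ in range(T)]
preds = [y + (0.0 if t < T // 2 else 3.0) + random.gauss(0.0, 0.1)
         for t, y in enumerate(ys)]

theta, avg_grad = debias_online(preds, ys, eta=0.1)
print(f"correction={theta:.2f}, average gradient={avg_grad:.4f}")
```

By a telescoping argument the average gradient equals (θ₁ − θ_{T+1})/(ηT) exactly, so it vanishes at rate O(1/T) whenever the iterates stay bounded, which is why no step-size decay is needed.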

📝 Abstract
We present a new perspective on online learning that we refer to as gradient equilibrium: a sequence of iterates achieves gradient equilibrium if the average of gradients of losses along the sequence converges to zero. In general, this condition is not implied by nor implies sublinear regret. It turns out that gradient equilibrium is achievable by standard online learning methods such as gradient descent and mirror descent with constant step sizes (rather than decaying step sizes, as is usually required for no regret). Further, as we show through examples, gradient equilibrium translates into an interpretable and meaningful property in online prediction problems spanning regression, classification, quantile estimation, and others. Notably, we show that the gradient equilibrium framework can be used to develop a debiasing scheme for black-box predictions under arbitrary distribution shift, based on simple post hoc online descent updates. We also show that post hoc gradient updates can be used to calibrate predicted quantiles under distribution shift, and that the framework leads to unbiased Elo scores for pairwise preference prediction.
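The quantile-calibration claim in the abstract admits a similarly short sketch (illustrative, not taken from the paper): for the pinball loss at level τ, the gradient with respect to the predicted quantile q is 1{y < q} − τ, so the average gradient equals the empirical coverage minus τ. Constant-step online descent on this loss therefore drives coverage toward τ under arbitrary distribution shift. The `track_quantile` helper and the Gaussian stream are assumptions for the demo.

```python
import random

def track_quantile(ys, tau=0.9, eta=0.05, q0=0.0):
    """Online quantile calibration via constant-step gradient descent on
    the pinball loss. The gradient at q_t is 1{y_t < q_t} - tau, so the
    running average of gradients is (empirical coverage) - tau; gradient
    equilibrium means the predicted quantiles are calibrated on average.
    """
    q, grads = q0, []
    for y in ys:
        g = (1.0 if y < q else 0.0) - tau   # pinball-loss gradient at q
        grads.append(g)
        q -= eta * g                        # constant step size
    return q, sum(grads) / len(grads)

random.seed(1)
ys = [random.gauss(0.0, 1.0) for _ in range(5000)]
q, avg_grad = track_quantile(ys, tau=0.9, eta=0.05)
coverage = avg_grad + 0.9    # fraction of steps with y_t < q_t
print(f"final quantile={q:.2f}, empirical coverage={coverage:.3f}")
```

The same telescoping identity applies: the average gradient is (q₁ − q_{T+1})/(ηT), so bounded iterates alone guarantee that empirical coverage converges to τ, with no assumptions on how the data distribution drifts.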
Problem

Research questions and friction points this paper is trying to address.

Online Learning
Distribution Shift
Elo Rating Fairness
Innovation

Methods, ideas, or system contributions that make the work stand out.

Gradient Equilibrium
Distribution Shift Correction
Calibrated Quantile Prediction