Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity

📅 2026-04-03

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses the challenge of adaptive bidding for a budget-constrained bidder in repeated first-price auctions under non-stationary environments, aiming to maximize cumulative utility. The authors propose an online bidding strategy based on dual gradient descent, which dynamically updates the dual variable associated with the budget constraint. They establish the first sublinear regret bound for non-stationary first-price auctions by introducing a Wasserstein-variation-based measure of non-stationarity and mitigating its impact through predictive budget allocation. By constructing a per-round budget benchmark, they achieve a tight regret bound of $\tilde{O}(\sqrt{T})$. Theoretically, the algorithm attains $\tilde{O}(\sqrt{T})$ regret plus a term capturing environmental non-stationarity when no predictions are available, and $\tilde{O}(\sqrt{T})$ regret plus prediction error when predictions are accessible, while remaining robust to deviations from the planned trajectory.
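The three guarantees described above can be written side by side. This is only a schematic restatement: the symbols $\mathrm{Reg}(T)$, $W_T$ (the Wasserstein-based variation term) and $E_T$ (the prediction error) are placeholder names chosen here, and their precise definitions and constants are not given in this summary.

```latex
\begin{align*}
  \text{Uninformative setting:} \quad
    & \mathrm{Reg}(T) \le \tilde{O}\big(\sqrt{T}\big) + O(W_T), \\
  \text{Informative setting:} \quad
    & \mathrm{Reg}(T) \le \tilde{O}\big(\sqrt{T}\big) + O(E_T), \\
  \text{Per-period budget benchmark:} \quad
    & \mathrm{Reg}(T) \le \tilde{O}\big(\sqrt{T}\big).
\end{align*}
```

Read this way, predictions trade the non-stationarity term $W_T$ for the prediction error $E_T$, and the refined per-period benchmark removes the additive term altogether.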

📝 Abstract

In this paper, we study how a budget-constrained bidder should learn to bid adaptively in repeated first-price auctions to maximize cumulative payoff. This problem arises from the recent industry-wide shift from second-price to first-price auctions in display advertising, which renders truthful bidding suboptimal. We propose a simple dual-gradient-descent-based bidding policy that maintains a dual variable for the budget constraint and updates it as the bidder consumes the budget. We analyze two settings based on the bidder's knowledge of future private values: (i) an uninformative setting where the underlying (potentially non-stationary) distributions are entirely unknown, and (ii) an informative setting where a prediction of the budget allocation is available in advance. We characterize the performance loss (regret) relative to an optimal policy with complete information. In the uninformative setting, we show that the regret is $\tilde{O}(\sqrt{T})$ plus a Wasserstein-based variation term capturing non-stationarity, which is order-optimal. In the informative setting, the variation term can be eliminated using predictions, yielding a regret of $\tilde{O}(\sqrt{T})$ plus the prediction error. Furthermore, we go beyond the global budget constraint by introducing a refined benchmark based on a per-period budget allocation plan, under which the regret is $\tilde{O}(\sqrt{T})$ with no additional term. We also establish robustness guarantees when the baseline policy deviates from the planned allocation, covering both ideal and adversarial deviations.
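To make the dual-gradient-descent idea concrete, here is a minimal sketch of a budget-paced first-price bidder. It is not the paper's algorithm: the function name `dual_descent_bidding`, the bid grid, the empirical win-probability estimate, the full-feedback assumption (the highest competing bid is observed after every round), and the constant step size are all assumptions made for illustration. The dual variable `lam` acts as a price on spending: it rises when a round's payment exceeds the per-round target `budget / T` and falls otherwise.

```python
from bisect import bisect_right, insort


def dual_descent_bidding(values, competing_bids, budget, step_size, grid_size=50):
    """Illustrative dual-gradient-descent bidder (all names are hypothetical).

    values: private values in [0, 1], one per round.
    competing_bids: highest competing bid per round, revealed after bidding
        (full feedback is an assumption of this sketch).
    Returns (total utility, total spend, final dual variable).
    """
    horizon = len(values)
    target_spend = budget / horizon      # per-round spending target
    lam = 0.0                            # dual variable for the budget constraint
    remaining = budget
    utility = 0.0
    seen = []                            # sorted history of competing bids
    grid = [i / grid_size for i in range(grid_size + 1)]

    def win_prob(bid):
        # Empirical fraction of past competing bids that `bid` would have beaten.
        if not seen:
            return 1.0
        return bisect_right(seen, bid) / len(seen)

    for value, rival in zip(values, competing_bids):
        # Never bid more than the remaining budget; 0.0 is always feasible.
        feasible = [b for b in grid if b <= remaining]
        # Maximize the Lagrangian-adjusted expected payoff on the grid.
        bid = max(feasible, key=lambda b: (value - (1.0 + lam) * b) * win_prob(b))

        won = bid >= rival
        payment = bid if won else 0.0    # first-price rule: the winner pays its bid
        if won:
            utility += value - bid
        remaining -= payment

        # Projected gradient step on the dual variable.
        lam = max(0.0, lam - step_size * (target_spend - payment))
        insort(seen, rival)

    return utility, budget - remaining, lam
```

A step size on the order of $1/\sqrt{T}$ is the usual choice for this kind of update, though the value the paper uses is not stated in this summary.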

Problem

Research questions and friction points this paper is trying to address.

first-price auctions

budget constraints

non-stationarity

adaptive bidding

online learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

adaptive bidding

first-price auctions

budget constraints