🤖 AI Summary
This paper studies the online bidding problem in repeated first-price auctions under return-on-investment (ROI) and budget constraints, designing low-regret algorithms benchmarked against the ex-post optimal randomized policy. Because first-price auctions are strategically non-truthful, prior regret analyses were limited to weaker benchmarks or ignored ROI constraints; this work establishes, for the first time, a near-optimal regret bound relative to the stochastic ex-post optimum. Methodologically, it integrates online convex optimization with multi-armed bandit techniques, combining gradient estimation with confidence-interval construction to obtain adaptive bidding strategies in both the full-feedback and bandit-feedback settings. The analysis yields regret bounds of $\widetilde{O}(\sqrt{T})$ under full feedback and $\widetilde{O}(T^{3/4})$ under bandit feedback, both strictly improving upon prior results. The framework is the first to benchmark against the stochastic ex-post optimum while jointly respecting ROI and budget constraints in first-price auctions.
📝 Abstract
Automated bidding to optimize online advertising under various constraints, e.g., ROI and budget constraints, is widely adopted by advertisers. A key challenge lies in designing algorithms for non-truthful mechanisms with ROI constraints. While prior work has addressed truthful auctions, or non-truthful auctions against weaker benchmarks, this paper provides a significant improvement: we develop online bidding algorithms for repeated first-price auctions with ROI constraints, benchmarked against the optimal randomized strategy in hindsight. In the full-feedback setting, where the maximum competing bid is observed, our algorithm achieves a near-optimal $\widetilde{O}(\sqrt{T})$ regret bound; in the bandit-feedback setting, where the bidder observes only whether it wins each auction, our algorithm attains an $\widetilde{O}(T^{3/4})$ regret bound.
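To make the setting concrete, the sketch below simulates a dual-based bidder for repeated first-price auctions under an ROI constraint (total value won must be at least `gamma` times total spend) and a budget. It is a generic illustration of the standard Lagrangian-dual approach from the autobidding literature, not the paper's algorithm: the dual variables `lam` (ROI) and `mu` (budget pacing) are updated by online gradient ascent on per-round constraint violations, and the bid is the dual-adjusted value cap. All function and variable names here are hypothetical.

```python
import random

def dual_bidding_sketch(values, competing_bids, gamma=1.0, budget=50.0, eta=0.05):
    """Illustrative dual-based bidder for repeated first-price auctions.

    ROI constraint:    total value won >= gamma * total spend.
    Budget constraint: total spend <= budget.

    lam and mu are Lagrange multipliers for the two constraints, updated
    by online gradient ascent on per-round violations. This is a hedged
    sketch of the generic dual approach, NOT the paper's method.
    """
    T = len(values)
    rho = budget / T            # per-round budget pacing target
    lam, mu = 0.0, 0.0          # dual variables (kept non-negative)
    spend = value_won = 0.0
    for v, d in zip(values, competing_bids):
        # Dual-adjusted bid cap: bidding above it yields negative
        # Lagrangian-adjusted utility, so we bid at most the cap.
        cap = (1.0 + lam) * v / (1.0 + lam * gamma + mu)
        b = max(0.0, min(cap, budget - spend))  # never exceed remaining budget
        win = b >= d
        cost = b if win else 0.0    # first price: the winner pays its own bid
        gain = v if win else 0.0
        spend += cost
        value_won += gain
        # Online gradient ascent on the duals; project back onto [0, inf).
        lam = max(0.0, lam + eta * (gamma * cost - gain))
        mu = max(0.0, mu + eta * (cost - rho))
    return spend, value_won

random.seed(0)
T = 2000
vals = [random.random() for _ in range(T)]   # private values, i.i.d. uniform
comps = [random.random() for _ in range(T)]  # maximum competing bids
spend, value_won = dual_bidding_sketch(vals, comps, gamma=1.0, budget=50.0)
print(spend, value_won)
```

In this full-feedback simulation the bidder could also use the observed competing bids `d` to refine its bids counterfactually; the sketch keeps only the dual updates to stay short. Note that `cap <= v` whenever `mu >= 0` and `gamma = 1`, so each winning round pays at most the value won, which is why the ROI constraint holds here pathwise rather than only on average.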