AI Summary
This work addresses the challenge of achieving low regret in contextual dynamic pricing under agnostic noise whose induced demand curve may be non-Lipschitz, with jumps and atoms, a setting where existing methods struggle. Focusing on linear valuations with bounded-support agnostic noise, the paper proposes a polynomial-time algorithm, Conservative-Markdown Redirect-UCB Pricing, that combines randomized parameter estimation, conservative residual-grid probing, and a confidence-based one-step redirection mechanism. The approach achieves, for the first time, a $\tilde O(T^{2/3})$ regret bound under non-Lipschitz demand, matching the lower bound of Kleinberg and Leighton (2003) up to logarithmic factors. This improves on the previous best $\tilde O(T^{3/4})$ regret of Xu and Wang (2022) and, under stochastic well-conditioned contexts, closes a long-standing theoretical gap in linear contextual pricing.
Abstract
We study contextual dynamic pricing with linear valuations and bounded-support agnostic noise, whose induced demand curve may be non-Lipschitz with arbitrary jumps and atoms. Such discontinuities break the cross-context interpolation arguments used by smooth-demand pricing algorithms, and the best previous method achieved only $\tilde O(T^{3/4})$ regret. We propose Conservative-Markdown Redirect-UCB Pricing, a polynomial-time algorithm that combines randomized parameter estimation, conservative residual-grid probing, and confidence-based one-step redirection. Our algorithm achieves the optimal $\tilde O(T^{2/3})$ regret, matching the known lower bound of Kleinberg and Leighton (2003) up to logarithmic factors and improving over the previous upper bound of Xu and Wang (2022). Under stochastic well-conditioned contexts, this closes the long-standing regret gap in linear-valuation contextual pricing under agnostic non-Lipschitz noise distributions.
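To give some intuition for why conservative (marked-down) grid prices can remain useful when demand has jumps, the sketch below checks a standard monotonicity fact in a toy, non-contextual setting: if demand is non-increasing and bounded by 1, rounding a price down to a grid of step $1/k$ loses at most $1/k$ in expected revenue, whereas rounding up across an atom can lose much more. This is not the paper's algorithm. The demand curve, the atom location, and every function name here are hypothetical choices made for illustration only.

```python
from fractions import Fraction
import math

ATOM = Fraction(1, 2)  # hypothetical location of the point mass in the noise


def demand(p: Fraction) -> Fraction:
    """Hypothetical purchase probability P(noise >= p) for p in [0, 1].

    The noise equals ATOM with probability 3/5 and is uniform on [0, 1]
    otherwise, so demand is non-increasing but drops by 3/5 just above 1/2.
    """
    jump = Fraction(3, 5) if p <= ATOM else Fraction(0)
    return Fraction(2, 5) * (1 - p) + jump


def revenue(p: Fraction) -> Fraction:
    """Expected revenue from posting price p."""
    return p * demand(p)


def mark_down(p: Fraction, k: int) -> Fraction:
    """Round p down to the grid {0, 1/k, ..., 1}."""
    return Fraction(math.floor(p * k), k)


def mark_up(p: Fraction, k: int) -> Fraction:
    """Round p up to the same grid (the non-conservative direction)."""
    return Fraction(math.ceil(p * k), k)


def worst_markdown_loss(k: int, resolution: int = 1000) -> Fraction:
    """Largest revenue lost by marking down, over a fine set of test prices."""
    prices = [Fraction(i, resolution) for i in range(resolution + 1)]
    return max(revenue(p) - revenue(mark_down(p, k)) for p in prices)


if __name__ == "__main__":
    k = 7  # grid step 1/7, chosen so that the atom at 1/2 is off-grid
    print("grid step:", float(Fraction(1, k)))
    print("worst markdown loss:", float(worst_markdown_loss(k)))
    print("loss from rounding 1/2 up:",
          float(revenue(ATOM) - revenue(mark_up(ATOM, k))))
```

Exact rational arithmetic is used so that comparisons at the jump are not affected by floating-point rounding. In the non-contextual case, a discretization loss of order $T/k$ combined with a bandit learning cost of order $\sqrt{kT}$ is balanced at $k \approx T^{1/3}$, which is one common route to a $T^{2/3}$ rate; how the paper handles contexts and parameter estimation is not covered by this sketch.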