Localized exploration in contextual dynamic pricing achieves dimension-free regret

📅 2024-12-26
🤖 AI Summary
This paper studies contextual dynamic pricing under a linear demand model, focusing on the exploration-exploitation trade-off and on regret bounds that do not depend on the covariate dimension. It proposes the localized exploration-then-commit (LetC) algorithm, which sets prices in three stages: pure exploration, a refinement stage that explores near the learned optimal pricing policy, and pure exploitation. The contributions are threefold: (i) a minimax-optimal, dimension-free regret bound that holds when the time horizon exceeds a polynomial of the covariate dimension; (ii) a general theoretical framework covering the entire range of time horizons, showing how to balance exploration and exploitation when the horizon is limited; and (iii) a novel critical inequality that captures the exploration-exploitation trade-off in dynamic pricing, mirroring its counterpart for the bias-variance trade-off in regularized regression. Experiments on synthetic and real-world data support the theoretical results.

📝 Abstract
We study the problem of contextual dynamic pricing with a linear demand model. We propose a novel localized exploration-then-commit (LetC) algorithm which starts with a pure exploration stage, followed by a refinement stage that explores near the learned optimal pricing policy, and finally enters a pure exploitation stage. The algorithm is shown to achieve a minimax optimal, dimension-free regret bound when the time horizon exceeds a polynomial of the covariate dimension. Furthermore, we provide a general theoretical framework that encompasses the entire time spectrum, demonstrating how to balance exploration and exploitation when the horizon is limited. The analysis is powered by a novel critical inequality that depicts the exploration-exploitation trade-off in dynamic pricing, mirroring its existing counterpart for the bias-variance trade-off in regularized regression. Our theoretical results are validated by extensive experiments on synthetic and real-world data.
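The three-stage structure described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's LetC specification: the demand parametrization `D = alpha·x - (beta·x)·p + noise`, the covariate distribution, the price range, the stage lengths `T1` and `T2`, the perturbation radius `delta`, and all function names are assumptions chosen only to make the explore, refine, commit pattern concrete.

```python
import numpy as np


def fit_linear_demand(X, P, D):
    """Least-squares fit of D ~ alpha.x - (beta.x) * p; returns (alpha_hat, beta_hat)."""
    d = X.shape[1]
    features = np.hstack([X, -P[:, None] * X])
    theta, *_ = np.linalg.lstsq(features, D, rcond=None)
    return theta[:d], theta[d:]


def greedy_price(x, alpha_hat, beta_hat, p_lo, p_hi):
    """Revenue-maximizing price under the fitted model, clipped to [p_lo, p_hi]."""
    slope = max(float(beta_hat @ x), 1e-6)  # guard against a non-positive fitted slope
    return float(np.clip((alpha_hat @ x) / (2.0 * slope), p_lo, p_hi))


def run_three_stage(T, T1, T2, delta, alpha, beta,
                    p_lo=0.5, p_hi=5.0, noise=0.1, seed=0):
    """Simulate explore (T1 rounds), local refinement (T2 rounds), then commit.

    Returns cumulative expected-revenue regret and the last fitted parameters.
    All modelling choices here are illustrative assumptions.
    """
    d = alpha.shape[0]
    assert T1 >= 2 * d, "need enough exploration rounds to fit 2d parameters"
    rng = np.random.default_rng(seed)
    X, P, D = np.empty((T, d)), np.empty(T), np.empty(T)
    alpha_hat = beta_hat = None
    regret = 0.0
    for t in range(T):
        x = rng.uniform(0.5, 1.5, size=d)  # assumed covariate distribution
        if t < T1:
            # Stage 1: pure exploration with prices drawn over the whole range.
            p = rng.uniform(p_lo, p_hi)
        else:
            # Refit once at the start of stage 2 and once at the start of stage 3.
            if t == T1 or t == T1 + T2:
                alpha_hat, beta_hat = fit_linear_demand(X[:t], P[:t], D[:t])
            p = greedy_price(x, alpha_hat, beta_hat, p_lo, p_hi)
            if t < T1 + T2:
                # Stage 2: explore only in a small neighborhood of the learned price.
                p = float(np.clip(p + rng.uniform(-delta, delta), p_lo, p_hi))
        a, b = float(alpha @ x), float(beta @ x)
        X[t], P[t], D[t] = x, p, a - b * p + noise * rng.normal()
        p_star = float(np.clip(a / (2.0 * b), p_lo, p_hi))
        regret += p_star * (a - b * p_star) - p * (a - b * p)
    return regret, alpha_hat, beta_hat
```

For example, calling `run_three_stage(T=2000, T1=200, T2=400, delta=0.3, alpha=np.array([3.0, 2.0, 1.0]), beta=np.array([0.5, 0.3, 0.2]))` runs a short exploration stage followed by local refinement. Setting `T1=T` reduces the sketch to purely random pricing, which gives a baseline for comparing cumulative regret.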
Problem

Research questions and friction points this paper is trying to address.

Dynamic Pricing
Dimensionality Independence
Linear Demand Model
Innovation

Methods, ideas, or system contributions that make the work stand out.

Dynamic Pricing Algorithm
Dimension-Invariant Optimization
Exploration-Exploitation Balance
Jinhang Chai
Department of Operations Research and Financial Engineering, Princeton University
Yaqi Duan
Department of Technology, Operations and Statistics at NYU Stern
Jianqing Fan
Department of Operations Research and Financial Engineering, Princeton University
Kaizheng Wang
Columbia University
Machine Learning, Statistics, Optimization