🤖 AI Summary
Existing dynamic pricing methods require prior knowledge of demand function parameters, such as the Hölder smoothness level or the Lipschitz constant, which limits their applicability when these parameters are unknown. Method: We propose the first parameter-free adaptive dynamic pricing framework, combining adaptive partitioning of the demand function's domain with linear stochastic bandit modeling. It requires no prior knowledge of the smoothness level or Lipschitz constant and balances exploration and exploitation, supported by a nonparametric regret analysis. Contribution/Results: Theoretically, the algorithm attains a tight cumulative regret bound of $O(\sqrt{T})$. Empirically, it significantly outperforms baseline methods across diverse unknown demand structures. It also accommodates contextual extensions and is robust to model misspecification and environmental heterogeneity.
📝 Abstract
Dynamic pricing is crucial in sectors like e-commerce and transportation, balancing exploration of demand patterns and exploitation of pricing strategies. Existing methods often require precise knowledge of the demand function, e.g., the Hölder smoothness level and Lipschitz constant, limiting practical utility. This paper introduces an adaptive approach to address these challenges without prior parameter knowledge. By partitioning the demand function's domain and employing a linear bandit structure, we develop an algorithm that manages regret efficiently, enhancing flexibility and practicality. Our Parameter-Adaptive Dynamic Pricing (PADP) algorithm outperforms existing methods, offering improved regret bounds and extensions for contextual information. Numerical experiments validate our approach, demonstrating its superiority in handling unknown demand parameters.
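To make the idea of "partitioning the demand function's domain and employing a linear bandit structure" more concrete, the following is a minimal illustrative sketch. It is not the PADP algorithm from the paper. It assumes a fixed, equal-width partition of the price interval, a local polynomial demand model within each bin, and a standard LinUCB-style optimistic estimate. The names `BinnedLinUCBPricer`, `features`, and `simulate`, the parameters `alpha`, `lam`, and `grid_per_bin`, and the toy demand curve are all hypothetical choices made for illustration.

```python
import numpy as np


def features(price, center, degree=1):
    # Local polynomial features of the price around the bin center (assumed model).
    return np.array([(price - center) ** k for k in range(degree + 1)], dtype=float)


class BinnedLinUCBPricer:
    """Illustrative sketch: one ridge-regression (LinUCB-style) model per price bin."""

    def __init__(self, p_min, p_max, n_bins=4, degree=1,
                 alpha=1.0, lam=1.0, grid_per_bin=5):
        edges = np.linspace(p_min, p_max, n_bins + 1)
        self.centers = 0.5 * (edges[:-1] + edges[1:])
        # Candidate prices inside each bin.
        self.grids = [np.linspace(edges[j], edges[j + 1], grid_per_bin)
                      for j in range(n_bins)]
        self.degree = degree
        self.alpha = alpha
        d = degree + 1
        # Per-bin regularized Gram matrix and response vector.
        self.A = [lam * np.eye(d) for _ in range(n_bins)]
        self.b = [np.zeros(d) for _ in range(n_bins)]

    def choose_price(self):
        # Pick the (bin, price) pair with the largest optimistic revenue estimate.
        # Assumes nonnegative prices, so price * UCB(demand) is optimistic.
        best_value, best_bin, best_price = -np.inf, None, None
        for j, grid in enumerate(self.grids):
            A_inv = np.linalg.inv(self.A[j])
            theta = A_inv @ self.b[j]
            for p in grid:
                x = features(p, self.centers[j], self.degree)
                ucb_demand = x @ theta + self.alpha * np.sqrt(x @ A_inv @ x)
                value = p * ucb_demand
                if value > best_value:
                    best_value, best_bin, best_price = value, j, p
        return best_bin, best_price

    def update(self, j, price, demand):
        # Standard ridge-regression update for the bin that was played.
        x = features(price, self.centers[j], self.degree)
        self.A[j] += np.outer(x, x)
        self.b[j] += demand * x


def simulate(T=200, seed=0, p_min=0.5, p_max=1.5):
    # Toy environment (assumption): demand(p) = 1 - 0.5 p plus Gaussian noise.
    rng = np.random.default_rng(seed)
    pricer = BinnedLinUCBPricer(p_min, p_max)
    prices = np.empty(T)
    for t in range(T):
        j, p = pricer.choose_price()
        demand = 1.0 - 0.5 * p + 0.05 * rng.normal()
        pricer.update(j, p, demand)
        prices[t] = p
    return prices
```

Calling `simulate(T=200, seed=0)` returns the sequence of posted prices in the toy environment. A parameter-adaptive method such as the one the abstract describes would presumably choose the partition and model complexity from data rather than fixing `n_bins` and `degree` in advance, which this sketch does not attempt.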