🤖 AI Summary
This study addresses dynamic pricing against a strategic buyer who has a fixed private valuation and discounts future utility. The authors investigate how a seller can learn the buyer's type and minimize regret by introducing menu mechanisms, which offer allocation-payment contracts and combine the revelation principle from mechanism design with online learning techniques. Such mechanisms can either learn the buyer's type indirectly or elicit it directly. The key conceptual contribution is an equivalence between indirect learning mechanisms and direct revelation mechanisms: both achieve the same optimal regret, which ties online learning to the revelation principle. The work also achieves a regret bound of $O(T_\gamma \log T_\gamma)$, where $T_\gamma$ is the buyer's effective discounted time horizon, improving on all prior bounds.
📝 Abstract
We study dynamic pricing where a seller repeatedly interacts with a strategic, non-myopic buyer who has a fixed private valuation and discounts future utility. Prior work focused exclusively on posted-price mechanisms, which only extract binary accept/reject signals. For our first result, we show that menu mechanisms, which offer allocation-payment contracts, achieve $O(T_\gamma \log T_\gamma)$ regret, where $T_\gamma$ is the buyer's effective discounted time horizon, improving all prior bounds. Our second contribution is more conceptual in nature. The problem of dynamic pricing sits at the intersection of two paradigms: adaptive learning in computer science and machine learning, and revelation-principle-based mechanism design in economics. Yet their relationship has remained unclear. We establish a fundamental equivalence: indirect learning mechanisms and direct revelation mechanisms achieve identical optimal regret. The adaptive, data-driven algorithms of online learning and explicit type elicitation are two languages for solving the same problem; hence, learning is revelation in disguise.