🤖 AI Summary
This study addresses repeated bilateral trade in perturbed markets, where an underlying stochastic distribution is subject to adversarial contamination of level C, with the goal of maximizing gains from trade (GFT) under a global budget balance constraint. The work proposes a contamination-aware pricing algorithm that achieves no-regret learning with respect to the best budget-balanced distribution over prices. The algorithm does not treat the environment as either purely stochastic or purely adversarial: its regret scales with the level of contamination. Theoretical analysis shows that it attains a regret bound of Õ(T^{3/4}) + O(C log T) relative to the best budget-balanced distribution over prices, and that it maintains a worst-case Õ(T^{3/4}) regret bound relative to a per-round budget-balanced baseline even under fully adversarial conditions, thereby bridging the gap between the purely stochastic and purely adversarial regimes.
📝 Abstract
We address the problem of maximizing gains from trade (GFT) in repeated buyer-seller exchanges subject to global budget balance constraints. While this problem is well understood in purely adversarial and purely stochastic settings, the two settings exhibit a sharp dichotomy: adversarial environments allow for no-regret learning against the best fixed-price mechanism, whereas stochastic environments allow for no-regret learning against the best distribution over prices that is budget balanced in expectation. This gap is significant, as policies that are budget balanced in expectation can increase the GFT by a multiplicative factor of two. In this work, we bridge these extremes by studying perturbed markets, where an underlying stochastic distribution is subject to an adversarial corruption of level $C$. We design an algorithm whose regret scales adaptively with the level of corruption, achieving an $\tilde{\mathcal{O}}(T^{3/4}) + \mathcal{O}(C\log(T))$ regret bound against the best budget-balanced distribution over prices. Simultaneously, our algorithm maintains the worst-case $\tilde{\mathcal{O}}(T^{3/4})$ regret bound relative to a per-round budget-balanced baseline, ensuring optimality even in fully adversarial environments.
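To make the quantities in the abstract concrete, the following is a minimal sketch of GFT and budget accounting in a repeated exchange. It assumes a common two-price formulation that the abstract does not spell out: each round the mechanism posts a price `seller_price` to the seller and a price `buyer_price` to the buyer, and trade happens only if both accept. A fixed-price mechanism is the special case where the two prices coincide. All names (`Round`, `trade_outcome`, `run_mechanism`, `is_globally_budget_balanced`) are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Round:
    """Private valuations of the two agents in one round (assumed in [0, 1])."""
    seller_value: float
    buyer_value: float


def trade_outcome(rnd: Round, seller_price: float, buyer_price: float) -> Tuple[float, float]:
    """Return (gft, revenue) for one round.

    Assumption: the seller accepts if their value is at most the price they
    are paid, and the buyer accepts if the price they pay is at most their value.
    """
    trades = rnd.seller_value <= seller_price and buyer_price <= rnd.buyer_value
    if not trades:
        return 0.0, 0.0
    gft = rnd.buyer_value - rnd.seller_value   # welfare created by the trade
    revenue = buyer_price - seller_price       # negative means the mechanism subsidizes
    return gft, revenue


def run_mechanism(rounds: Iterable[Round], prices: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Accumulate total GFT and the mechanism's cumulative budget over all rounds."""
    total_gft, budget = 0.0, 0.0
    for rnd, (seller_price, buyer_price) in zip(rounds, prices):
        gft, revenue = trade_outcome(rnd, seller_price, buyer_price)
        total_gft += gft
        budget += revenue
    return total_gft, budget


def is_globally_budget_balanced(budget: float) -> bool:
    """Global budget balance: the budget may dip in single rounds but must end non-negative."""
    return budget >= 0.0
```

Under these assumptions, a per-round budget-balanced mechanism never posts `buyer_price < seller_price`, whereas a globally budget-balanced one may subsidize some rounds as long as the cumulative budget ends non-negative. For example, `run_mechanism([Round(0.25, 0.75), Round(0.5, 0.5)], [(0.25, 0.75), (0.75, 0.25)])` collects 0.5 in the first round and spends it in the second, ending with a budget of exactly zero.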