🤖 AI Summary
This work addresses the online learning problem faced by a profit-maximizing intermediary in bilateral trade when buyer and seller valuations are generated by a smoothed adversary. The study combines a hierarchical net construction over the broker's action space with a chaining-based algorithmic analysis to exploit a continuity property of smooth instances. This approach extends the tight √T regret rate, previously established only under i.i.d. assumptions, to the broader smoothed adversarial regime, achieving for the first time an Õ(√T) regret bound in this economic model. The bound matches the minimax rate of the stochastic i.i.d. setting up to poly-logarithmic factors, and the same techniques yield a similarly tight Õ(√T) bound for a related mechanism design model, the joint ads problem, thereby closing a gap in the regret landscape for online learning in bilateral trade.
📝 Abstract
Bilateral trade models the task of intermediating between two strategic agents, a seller and a buyer, who wish to trade a good. We study this problem from the perspective of a profit-maximizing broker within an online learning framework, where the agents' valuations are generated by a smooth adversary.
We devise a learning algorithm that guarantees a $\tilde{O}(\sqrt{T})$ regret bound, which is tight in the time horizon $T$ up to poly-logarithmic factors. This matches the minimax rate for the stochastic i.i.d. case, and is also well separated from the adversarial setting, where sublinear regret is unattainable. By extending the strong regret guarantees from the i.i.d. case to the smooth adversary, we significantly broaden the scope of settings where such a fast rate is achievable, while closing an important gap in the regret landscape of this fundamental economic problem.
To overcome the challenges posed by this adversary, we leverage a continuity property of smooth instances and combine it with a hierarchical net construction of the broker's action space, which is analyzed via algorithmic chaining. We showcase the applicability of these techniques by deriving a similarly tight $\tilde{O}(\sqrt{T})$ regret bound for a related mechanism design model: the joint ads problem.
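As a rough illustration of what a hierarchical net over the broker's action space might look like, the sketch below builds nested dyadic grids of price pairs and the chain of successively finer approximations of a given action. It is not the paper's construction: the dyadic grid, the normalization of prices to $[0, 1]$, the function names, and the profit model (the broker posts price $p$ to the seller and $q$ to the buyer, and earns $q - p$ when both accept) are all assumptions made for this example.

```python
from fractions import Fraction
from math import floor


def net(level):
    """Level-k net (assumed dyadic): pairs (p, q), p <= q, on a grid of spacing 2**-level."""
    n = 2 ** level
    return [(Fraction(i, n), Fraction(j, n))
            for i in range(n + 1) for j in range(i, n + 1)]


def project(action, level):
    """Round both prices down to the level-k grid; rounding down preserves p <= q."""
    n = 2 ** level
    p, q = action
    return (Fraction(floor(p * n), n), Fraction(floor(q * n), n))


def chain(action, max_level):
    """Approximations of `action` at levels 0..max_level, coarse to fine."""
    return [project(action, k) for k in range(max_level + 1)]


def profit(action, seller_value, buyer_value):
    """Hypothetical broker profit: trade happens iff s <= p and q <= b, earning q - p."""
    p, q = action
    return q - p if seller_value <= p and q <= buyer_value else Fraction(0)


links = chain((Fraction(3, 10), Fraction(7, 10)), 3)
```

Because the grids are nested, every link of the chain lies in all finer nets, and consecutive links differ by at most $2^{-(k+1)}$ in each coordinate. A chaining argument sums the errors of such increments across levels rather than bounding the finest net in a single step.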