A Tight Regret Analysis of Non-Parametric Repeated Contextual Brokerage

📅 2025-03-03
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper studies a contextual version of the repeated brokerage problem: in each interaction, two traders with private valuations for an item decide whether to buy or sell at a price posted by a broker. The traders' valuations are zero-mean perturbations of an unknown market value that can change arbitrarily between interactions, with similar contexts corresponding to similar market prices. The broker's objective is to maximize the traders' net utility (the gain from trade) through context-dependent pricing, with performance measured as regret against an oracle that knows the traders' valuation distributions. The paper proposes algorithms for both the full-feedback and limited-feedback settings and establishes tight regret bounds for each. It also proves a tight 1/2-approximation result: the oracle that knows the valuation distributions achieves at least 1/2 of the gain from trade of an omniscient oracle that knows the realized valuations in advance.

📝 Abstract
We study a contextual version of the repeated brokerage problem. In each interaction, two traders with private valuations for an item seek to buy or sell based on a price proposed by the learner (a broker), which is informed by some contextual information. The broker's goal is to maximize the traders' net utility, also known as the gain from trade, by minimizing regret compared to an oracle with perfect knowledge of the traders' valuation distributions. We assume that traders' valuations are zero-mean perturbations of the item's unknown current market value, which can change arbitrarily from one interaction to the next, and that similar contexts correspond to similar market prices. We analyze two feedback settings: full feedback, where the traders' valuations are revealed to the broker after each interaction, and limited feedback, where only transaction attempts are revealed. For both feedback types, we propose algorithms achieving tight regret bounds. We further strengthen our performance guarantees by providing a tight 1/2-approximation result showing that the oracle that knows the traders' valuation distributions achieves at least 1/2 of the gain from trade of the omniscient oracle that knows in advance the actual realized traders' valuations.
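
The interaction described in the abstract can be illustrated with a minimal sketch. It assumes the gain-from-trade formulation commonly used in the brokerage literature: a trade happens only when the posted price lies between the two valuations, and the gain is then the gap between them. The function names, and the reading of "transaction attempts" as two willingness-to-buy indicators, are assumptions made for illustration and are not taken from the paper.

```python
def gain_from_trade(price, v, w):
    """Gain from trade for one interaction (assumed formulation).

    A trade occurs only if the price lies between the two valuations;
    the trader with the higher valuation buys from the other one.
    """
    lo, hi = min(v, w), max(v, w)
    return hi - lo if lo <= price <= hi else 0.0


def full_feedback(price, v, w):
    """Full feedback: both valuations are revealed after the interaction."""
    return v, w


def limited_feedback(price, v, w):
    """Limited feedback (assumed reading): only whether each trader
    would be willing to buy at the posted price is revealed."""
    return price <= v, price <= w


if __name__ == "__main__":
    # Hypothetical valuations and price, chosen only for illustration.
    print(gain_from_trade(0.5, 0.2, 0.9))
    print(limited_feedback(0.5, 0.2, 0.9))
```

Under this reading, limited feedback reveals only two bits per interaction, while full feedback lets the broker evaluate the gain of every price it could have posted.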

Problem

Research questions and friction points this paper is trying to address.

Maximize the traders' net utility (the gain from trade) in contextual brokerage.
Minimize regret against an oracle with perfect knowledge of the traders' valuation distributions.
Establish tight regret bounds in both the full-feedback and limited-feedback settings.
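
One way to write the regret objective named above is sketched below. The notation is an assumption for illustration ($V_t, W_t$ for the traders' valuations, $P_t$ for the broker's price at interaction $t$, $g$ for the gain from trade, $T$ for the number of interactions); the paper's exact definition may differ, for example in how contexts are conditioned on.

```latex
% Assumed notation; a sketch, not the paper's definition.
g(p, v, w) = \bigl(\max\{v, w\} - \min\{v, w\}\bigr)\,
             \mathbb{1}\bigl\{\min\{v, w\} \le p \le \max\{v, w\}\bigr\}

R_T = \sum_{t=1}^{T} \Bigl( \sup_{p} \mathbb{E}\bigl[g(p, V_t, W_t)\bigr]
      - \mathbb{E}\bigl[g(P_t, V_t, W_t)\bigr] \Bigr)
```

The supremum term is the benchmark of an oracle that knows the valuation distributions at each interaction but not the realized valuations.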

Innovation

Methods, ideas, or system contributions that make the work stand out.

Contextual information informs the broker's pricing strategy.
Algorithms achieve tight regret bounds for both feedback types.
A tight 1/2-approximation result strengthens the performance guarantees.
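
The 1/2-approximation result can be read as the following inequality. The symbols are assumed for illustration ($V, W$ for the two traders' valuations, $g$ for the gain from trade, equal to the gap between the valuations when the price $p$ lies between them and zero otherwise), and the statement holds under the paper's assumptions on the valuations rather than for arbitrary distributions.

```latex
% Assumed notation; a sketch of how the abstract's statement can be formalized.
g(p, v, w) = |v - w| \,\mathbb{1}\bigl\{\min\{v, w\} \le p \le \max\{v, w\}\bigr\}

\sup_{p} \mathbb{E}\bigl[g(p, V, W)\bigr]
  \;\ge\; \frac{1}{2}\, \mathbb{E}\Bigl[\sup_{p} g(p, V, W)\Bigr]
  \;=\; \frac{1}{2}\, \mathbb{E}\bigl[\,|V - W|\,\bigr]
```

The right-hand side is the benchmark of the omniscient oracle: knowing the realized valuations, it can always post a price between them and collect the full gap $|V - W|$.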
F. Bachoc
IMT, University of Toulouse, Institut universitaire de France (IUF)
Tommaso Cesari
EECS, University of Ottawa
Roberto Colomboni
Machine Learning Researcher at POLIMI (Milan) and UNIMI (Milan)
Statistical learning theory, Online learning, Multi-Armed Bandits