🤖 AI Summary
This work addresses the problem of efficiently constructing a uniform $\epsilon$-approximation of a multidimensional cumulative distribution function (CDF) under the constraint of one-bit bandit feedback, with applications to learning fixed-price mechanisms. By extending the classical Dvoretzky–Kiefer–Wolfowitz (DKW) inequality to the bandit-feedback setting and combining it with a grid discretization and careful control of logarithmic factors, the authors establish the first tight sample complexity upper bound of $\tilde{\mathcal{O}}(1/\epsilon^3)$ for uniform CDF approximation in multiple dimensions. Notably, the dimension enters only through logarithmic terms, so the bound is nearly dimension-free. This result yields tight sample complexity and regret bounds for fixed-price mechanisms in small markets such as bilateral trade, providing a theoretical foundation for mechanism design under high-dimensional bandit feedback.
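To make the feedback model concrete, here is a minimal Python sketch of a naive baseline, not the paper's algorithm: the learner commits to a grid point $q$ and, for a fresh sample $X_t$, observes only the single bit $\mathbf{1}\{X_t \le q\}$ (coordinate-wise), averaging these bits to estimate $F(q)$. Treating every grid point independently costs roughly $(1/\epsilon)^n \cdot \epsilon^{-2}$ bits in $n$ dimensions, exactly the exponential dependence the paper's $\tilde{\mathcal{O}}(1/\epsilon^3)$ bound avoids; all names and parameters below are illustrative.

```python
import numpy as np

def one_bit_cdf_estimate(sample_stream, grid_points, samples_per_point):
    """Naive baseline under one-bit feedback: for each grid point q we only
    ever observe the bit 1{X_t <= q} (coordinate-wise), never X_t itself,
    and estimate F(q) = P(X <= q) by the empirical mean of those bits."""
    estimates = {}
    for q in grid_points:
        bits = [bool(np.all(next(sample_stream) <= q))
                for _ in range(samples_per_point)]
        estimates[tuple(q)] = float(np.mean(bits))
    return estimates

# Toy example: X uniform on [0,1]^2, so F(q) = q_1 * q_2 on the unit cube.
rng = np.random.default_rng(0)

def stream():
    while True:
        yield rng.random(2)

eps = 0.1
axis = np.arange(eps, 1.0 + 1e-9, eps)
grid = [np.array([x, y]) for x in axis for y in axis]
# Hoeffding + a union bound over the grid: ~ log(|grid|/delta) / eps^2 bits
# per point suffice. The total is |grid| * eps^{-2} ~ eps^{-(n+2)} -- the
# naive cost that the paper's eps^{-3} * polylog bound improves on.
m = int(np.ceil(np.log(len(grid) / 0.05) / eps**2))
est = one_bit_cdf_estimate(stream(), grid, m)
max_err = max(abs(est[tuple(q)] - q[0] * q[1]) for q in grid)
print(f"max error over the grid: {max_err:.3f} (target eps = {eps})")
```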
📝 Abstract
We study the sample complexity of learning a uniform approximation of an $n$-dimensional cumulative distribution function (CDF) within an error $\epsilon>0$, when observations are restricted to minimal one-bit feedback. This serves as a counterpart to the multivariate DKW inequality under "full feedback", extending it to the setting of "bandit feedback". Our main result shows a near-invariance of the sample complexity with respect to the dimension: we obtain a uniform $\epsilon$-approximation with a sample complexity of $\frac{1}{\epsilon^3}\log\left(\frac{1}{\epsilon}\right)^{\mathcal{O}(n)}$ over an arbitrarily fine grid, where the dimensionality $n$ affects only logarithmic terms. As direct corollaries, we provide tight sample complexity bounds and novel regret guarantees for learning fixed-price mechanisms in small markets, such as bilateral trade settings.
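For orientation, a hedged side-by-side of the classical full-feedback DKW guarantee (stated here in its standard one-dimensional form, with Massart's tight constant) and the bandit-feedback bound from the abstract, written out in LaTeX:

```latex
% Full feedback (classical one-dimensional DKW inequality): the empirical
% CDF F_m built from m i.i.d. samples is uniformly close to F.
\Pr\left(\sup_{x}\left|F_m(x)-F(x)\right|>\epsilon\right)\le 2e^{-2m\epsilon^2}
\quad\Longrightarrow\quad
m=\mathcal{O}\!\left(\frac{1}{\epsilon^{2}}\log\frac{1}{\delta}\right)
\text{ samples suffice.}

% One-bit bandit feedback (this paper's bound over the grid): only a
% factor 1/\epsilon more, with the dimension n confined to log terms.
m=\frac{1}{\epsilon^{3}}\,\log\!\left(\frac{1}{\epsilon}\right)^{\mathcal{O}(n)}.
```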