🤖 AI Summary
This paper studies tolerant closeness testing of two distributions $P$ and $Q$ over the high-dimensional Boolean hypercube $\{0,1\}^n$ under the subcube conditional sampling ($\mathsf{SUBCOND}$) model: deciding whether $\|P - Q\|_1 \leq \varepsilon_1$ or $\|P - Q\|_1 \geq \varepsilon_2$, for $0 \leq \varepsilon_1 < \varepsilon_2$. Prior $\mathsf{SUBCOND}$-based tests handled only the non-tolerant case ($\varepsilon_1 = 0$) and required $\widetilde{O}(n^{5}/\varepsilon_2^{5})$ queries. The paper designs a tolerant testing methodology that works for any $\varepsilon_1 \geq 0$ and uses $\widetilde{O}(n^{3}/(\varepsilon_2 - \varepsilon_1)^{5})$ queries. This bound is polynomial in $n$, whereas the standard sampling model is subject to known exponential lower bounds. The result is a step toward making distribution testing of high-dimensional samplers practical.
📝 Abstract
We study the tolerant testing problem for high-dimensional samplers. Given as input two samplers $\mathcal{P}$ and $\mathcal{Q}$ over the $n$-dimensional space $\{0,1\}^n$, and two parameters $\varepsilon_2 > \varepsilon_1$, the goal of tolerant testing is to test whether the distributions generated by $\mathcal{P}$ and $\mathcal{Q}$ are $\varepsilon_1$-close or $\varepsilon_2$-far. Since exponential lower bounds (in $n$) are known for the problem in the standard sampling model, research has focused on models where one can draw \textit{conditional} samples. Among these models, \textit{subcube conditioning} ($\mathsf{SUBCOND}$), which allows conditioning on arbitrary subcubes of the domain, holds the promise of widespread adoption in practice owing to its ability to capture the natural behavior of samplers in constrained domains. To translate the promise into practice, we need to overcome two crucial roadblocks for tests based on $\mathsf{SUBCOND}$: the prohibitively large number of queries ($\tilde{\mathcal{O}}(n^5/\varepsilon_2^5)$) and limitation to non-tolerant testing (i.e., $\varepsilon_1 = 0$). The primary contribution of this work is to overcome the above challenges: we design a new tolerant testing methodology (i.e., $\varepsilon_1 \geq 0$) that allows us to significantly improve the upper bound to $\tilde{\mathcal{O}}(n^3/(\varepsilon_2-\varepsilon_1)^5)$.
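To make the $\mathsf{SUBCOND}$ access model concrete, the following is a minimal Python sketch of a subcube-conditioning oracle for a distribution over $\{0,1\}^n$ that is written out explicitly as a table. It is a brute-force illustration for small $n$ only, not the paper's testing algorithm. The names `subcond_sample` and `tv_distance` and the dictionary representation are assumptions introduced here. Using total variation distance to measure closeness is likewise an assumption, since the abstract does not name the metric.

```python
import itertools
import random


def subcond_sample(prob, n, fixed, rng):
    """Draw one sample conditioned on a subcube (hypothetical helper).

    prob  : dict mapping n-bit tuples to probabilities (explicit table, small n only)
    fixed : dict {coordinate index: bit}; coordinates not listed are left free
    rng   : random.Random instance
    Returns an n-bit tuple, or None if the subcube has zero probability mass.
    """
    # Enumerate the points of the subcube: all x that agree with `fixed`.
    points = [
        x for x in itertools.product((0, 1), repeat=n)
        if all(x[i] == b for i, b in fixed.items())
    ]
    weights = [prob.get(x, 0.0) for x in points]
    if sum(weights) == 0:
        return None  # conditioning on a zero-mass subcube is undefined
    # random.choices normalizes the weights, i.e. it samples from the conditional.
    return rng.choices(points, weights=weights, k=1)[0]


def tv_distance(p, q, n):
    """Total variation distance between two explicit distributions (assumed metric)."""
    return 0.5 * sum(
        abs(p.get(x, 0.0) - q.get(x, 0.0))
        for x in itertools.product((0, 1), repeat=n)
    )
```

For example, with `p = {(0, 0): 0.5, (1, 1): 0.5}`, the call `subcond_sample(p, 2, {0: 1}, random.Random(0))` fixes the first bit to 1 and can only return `(1, 1)`. A tolerant tester would have to decide between "$\varepsilon_1$-close" and "$\varepsilon_2$-far" using only such conditional queries, without access to the explicit table that `tv_distance` relies on.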