🤖 AI Summary
This work studies the query complexity of sampling from non-log-concave distributions $p(x) \propto e^{-f(x)}$, focusing on regimes where the target lacks favorable isoperimetric properties. Via an information-theoretic lower-bound construction matched by an algorithm, the paper establishes the tight query complexity $\left(\frac{LM}{d\epsilon}\right)^{\Theta(d)}$ for $L$-log-smooth targets with second moment at most $M$. Key findings include: (i) the condition that all distributions along the trajectory of the Ornstein–Uhlenbeck process be $\mathcal{O}(1)$-log-smooth, under which Huang et al. (COLT'24) obtained quasi-polynomial query complexity, is strictly stronger than requiring the target itself to be $\mathcal{O}(1)$-log-smooth; (ii) this trajectory condition is further analyzed for mixtures of Gaussians; and (iii) for a wide range of parameters, sampling is strictly easier than optimization by a factor super-exponential in the dimension $d$. Collectively, these results give a unified characterization of the intrinsic hardness of non-log-concave sampling.
📝 Abstract
We study the problem of sampling from a $d$-dimensional distribution with density $p(x)\propto e^{-f(x)}$, which does not necessarily satisfy good isoperimetric conditions. Specifically, we show that for any $L,M$ satisfying $LM\ge d\ge 5$, $\epsilon\in\left(0,\frac{1}{32}\right)$, and any algorithm with query access to the values of $f(x)$ and $\nabla f(x)$, there exists an $L$-log-smooth distribution with second moment at most $M$ such that the algorithm requires $\left(\frac{LM}{d\epsilon}\right)^{\Omega(d)}$ queries to compute a sample whose distribution is within $\epsilon$ in total variation distance of the target distribution. We complement the lower bound with an algorithm requiring $\left(\frac{LM}{d\epsilon}\right)^{\mathcal{O}(d)}$ queries, thereby characterizing the tight (up to the constant in the exponent) query complexity for sampling from the family of non-log-concave distributions. Our results are in sharp contrast with the recent work of Huang et al. (COLT'24), where an algorithm with quasi-polynomial query complexity was proposed for sampling from a non-log-concave distribution when $M=\mathtt{poly}(d)$. Their algorithm works under the stronger condition that all distributions along the trajectory of the Ornstein–Uhlenbeck process, starting from the target distribution, are $\mathcal{O}(1)$-log-smooth. We investigate this condition and prove that it is strictly stronger than requiring the target distribution to be $\mathcal{O}(1)$-log-smooth. Additionally, we study this condition in the context of mixtures of Gaussians. Finally, we place our results within the broader theme of ``sampling versus optimization'', as studied in Ma et al. (PNAS'19). We show that for a wide range of parameters, sampling is strictly easier than optimization by a super-exponential factor in the dimension $d$.
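The Ornstein–Uhlenbeck (OU) condition above can be probed numerically: the OU flow $X_t = e^{-t}X_0 + \mathcal{N}(0, 1-e^{-2t})$ maps a Gaussian mixture to another Gaussian mixture with shrunk means and variance interpolated toward 1, so the log-smoothness constant $\sup_x \lVert \nabla^2 f_t(x) \rVert$ of each intermediate distribution can be estimated on a grid. The sketch below is an illustrative 1D check (the equal-variance mixture, grid range, and finite-difference estimator are our own choices, not the paper's construction):

```python
import numpy as np

def mixture_logpdf(x, means, var, weights):
    # Log density of a 1D Gaussian mixture with shared variance `var`,
    # computed with the log-sum-exp trick for numerical stability.
    comps = -0.5 * (x[:, None] - means[None, :])**2 / var - 0.5 * np.log(2 * np.pi * var)
    m = comps.max(axis=1, keepdims=True)
    return m[:, 0] + np.log((weights * np.exp(comps - m)).sum(axis=1))

def log_smoothness(means, var, weights, grid):
    # Estimate sup_x |f''(x)| for f = -log p via a central second difference.
    h = grid[1] - grid[0]
    f = -mixture_logpdf(grid, means, var, weights)
    f2 = (f[2:] - 2 * f[1:-1] + f[:-2]) / h**2
    return np.abs(f2).max()

def ou_evolve(means, var, t):
    # The OU process keeps Gaussian mixtures Gaussian: means scale by e^{-t},
    # the shared variance becomes var * e^{-2t} + (1 - e^{-2t}).
    s = np.exp(-t)
    return means * s, var * s**2 + (1 - s**2)

if __name__ == "__main__":
    grid = np.linspace(-8, 8, 4001)
    means, var, w = np.array([-3.0, 3.0]), 0.25, np.array([0.5, 0.5])
    for t in [0.0, 0.5, 1.0, 2.0]:
        mt, vt = ou_evolve(means, var, t)
        print(f"t = {t:.1f}: estimated log-smoothness {log_smoothness(mt, vt, w, grid):.2f}")
```

Scanning the printed values over $t$ is exactly the kind of check needed to see whether a target that is $L$-log-smooth stays $\mathcal{O}(1)$-log-smooth along the whole OU trajectory, which the paper shows can fail.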