🤖 AI Summary
This work addresses the problem of efficiently estimating the partition function given only samples from a proposal distribution and access to the unnormalized density ratio of the target distribution. To this end, it introduces the “integrated coverage profile,” a functional that characterizes how much of the target distribution’s mass is concentrated in regions of high density ratio. A general information-theoretic framework is established that depends solely on the $f$-divergence between the proposal and target distributions and applies to broad settings, including heavy-tailed distributions. Under minimal assumptions, the framework rigorously separates the complexity of approximate sampling from that of counting. By combining a generalized Paley–Zygmund inequality with an importance sampling analysis, the paper derives matching upper and lower bounds on the sample complexity of multiplicative partition function estimation, unifying and extending classical results on importance sampling, rejection sampling, and heavy-tailed mean estimation while providing sharper finite-sample guarantees.
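As a concrete illustration of the setup (a minimal sketch, not the paper's method), the snippet below estimates a partition function $Z = \mathbb{E}_{x \sim q}[\tilde p(x)/q(x)]$ by plain importance sampling, and also computes an empirical proxy for coverage: the fraction of target mass carried by points whose density ratio exceeds a threshold. The abstract does not spell out the exact definition of the integrated coverage profile, so this proxy is an assumption for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Proposal q: standard normal. Unnormalized target p~(x) = exp(-(x - 1)^2 / 2),
# a shifted Gaussian whose true partition function is sqrt(2*pi) ~ 2.5066.
def q_density(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def p_tilde(x):
    return np.exp(-0.5 * (x - 1.0) ** 2)

n = 100_000
x = rng.standard_normal(n)          # samples from the proposal q
w = p_tilde(x) / q_density(x)       # unnormalized density ratio w(x)

z_hat = w.mean()                    # unbiased estimator: E_q[w] = Z
print(f"Z_hat = {z_hat:.4f}  (true Z = {np.sqrt(2.0 * np.pi):.4f})")

# Illustrative coverage proxy (an assumption, not the paper's definition):
# self-normalized estimate of the target mass in the region {w > t}.
for t in (1.0, 5.0, 25.0):
    mass_above_t = w[w > t].sum() / w.sum()
    print(f"estimated target mass where w > {t:>4}: {mass_above_t:.4f}")
```

When the proposal matches the target poorly, the weight mass concentrates on a rare high-ratio region, which is the kind of behavior the coverage-based characterization is meant to quantify.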
📝 Abstract
We study the statistical complexity of estimating partition functions given sample access to a proposal distribution and an unnormalized density ratio for a target distribution. While partition function estimation is a classical problem, existing guarantees typically rely on structural assumptions about the domain or model geometry. We instead provide a general, information-theoretic characterization that depends only on the relationship between the proposal and target distributions. Our analysis introduces the integrated coverage profile, a functional that quantifies how much target mass lies in regions where the density ratio is large. We show that integrated coverage tightly characterizes the sample complexity of multiplicative partition function estimation, with matching lower bounds in all regimes. We further express these bounds in terms of $f$-divergences, yielding sharp phase transitions depending on the growth rate of $f$ and recovering classical results as special cases while extending to heavy-tailed regimes. As applications, we derive improved finite-sample guarantees for importance sampling and self-normalized importance sampling, and we show a strict separation between the complexity of approximate sampling and counting under the same divergence constraints. Our results unify and generalize prior analyses of importance sampling, rejection sampling, and heavy-tailed mean estimation, providing a minimal-assumption theory of partition function estimation. Along the way we introduce new technical tools, including connections between coverage and $f$-divergences and a generalization of the classical Paley–Zygmund inequality.
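For reference, the classical Paley–Zygmund inequality that the final sentence refers to states that for a nonnegative random variable $X$ with finite second moment and any $\theta \in [0, 1]$,

\[
\Pr\!\left[X \ge \theta\,\mathbb{E}[X]\right] \;\ge\; (1-\theta)^2\,\frac{\mathbb{E}[X]^2}{\mathbb{E}[X^2]}.
\]

One natural instantiation in this setting (the abstract does not spell this out, so this is a guess at the connection) applies it to the importance weight $w(x) = \tilde p(x)/q(x)$ under $x \sim q$, where $\mathbb{E}_q[w] = Z$ and $\mathbb{E}_q[w^2] = Z^2\bigl(1 + \chi^2(p\,\|\,q)\bigr)$, turning second-moment ($\chi^2$-divergence) control into a lower bound on the probability of drawing a weight that is a constant fraction of $Z$; the paper's generalization presumably extends this beyond the $\chi^2$ case to general $f$-divergences.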