🤖 AI Summary
This paper studies the sample complexity of simple binary hypothesis testing, i.e., the minimum number of i.i.d. samples required to distinguish two distributions $p$ and $q$, in both the minimax setting (with type-I and type-II error probabilities at most $\alpha$ and $\beta$) and the Bayesian setting (with prior $(\alpha, 1-\alpha)$ and Bayes error at most $\delta$), providing a unified characterization. For the first time, it establishes a formula for the sample complexity that is tight up to distribution-independent constant factors and valid for all $\alpha, \beta \le 1/8$ in the minimax setting and all $\delta \le \alpha/4$ in the Bayesian setting, thereby removing restrictive assumptions such as $\alpha = \beta$ or $\alpha = 1/2$ that were prevalent in prior work. The analysis rests on $f$-divergence theory; the key technical ingredient is a tight inequality linking members of the Hellinger and Jensen–Shannon divergence families. The resulting bounds admit equivalent formulations in terms of either family, with applications to robust hypothesis testing, locally differentially private inference, and communication-constrained distributed hypothesis testing.
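As a rough illustration (not the paper's exact formula), the sketch below computes the squared Hellinger divergence between two discrete distributions and the classical symmetric-error estimate $n \asymp \log(1/\delta)/H^2(p,q)$, which is the previously known characterization for the special case $\alpha = \beta = \delta$ that the paper generalizes. The function names and toy distributions are hypothetical, and the hidden constant factor is not specified.

```python
import numpy as np

def hellinger_sq(p, q):
    """Squared Hellinger divergence H^2(p, q) = 1 - sum_i sqrt(p_i * q_i)
    between two discrete distributions given as probability vectors."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return 1.0 - np.sum(np.sqrt(p * q))

def symmetric_sample_complexity(p, q, delta):
    """Classical symmetric-error estimate: when type-I and type-II errors
    both equal delta, the sample complexity scales as log(1/delta) / H^2(p, q),
    up to an unspecified constant factor."""
    return np.log(1.0 / delta) / hellinger_sq(p, q)

# Toy example: two nearby distributions over a two-element alphabet.
p = [0.5, 0.5]
q = [0.6, 0.4]
print(hellinger_sq(p, q))                       # ~0.005
print(symmetric_sample_complexity(p, q, 1e-3))  # order-of-magnitude sample count
```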
📝 Abstract
The sample complexity of simple binary hypothesis testing is the smallest number of i.i.d. samples required to distinguish between two distributions $p$ and $q$ in either: (i) the prior-free setting, with type-I error at most $\alpha$ and type-II error at most $\beta$; or (ii) the Bayesian setting, with Bayes error at most $\delta$ and prior distribution $(\alpha, 1-\alpha)$. This problem has only been studied when $\alpha = \beta$ (prior-free) or $\alpha = 1/2$ (Bayesian), and the sample complexity is known to be characterized by the Hellinger divergence between $p$ and $q$, up to multiplicative constants. In this paper, we derive a formula that characterizes the sample complexity (up to multiplicative constants that are independent of $p$, $q$, and all error parameters) for: (i) all $0 \le \alpha, \beta \le 1/8$ in the prior-free setting; and (ii) all $\delta \le \alpha/4$ in the Bayesian setting. In particular, the formula admits equivalent expressions in terms of certain divergences from the Jensen--Shannon and Hellinger families. The main technical result concerns an $f$-divergence inequality between members of the Jensen--Shannon and Hellinger families, which is proved by a combination of information-theoretic tools and case-by-case analyses. We explore applications of our results to robust and distributed (locally-private and communication-constrained) hypothesis testing.
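The abstract's formula involves divergences from the Jensen--Shannon family. As an illustrative sketch only, and under the assumption that the skewed Jensen--Shannon divergence built from the mixture $\alpha p + (1-\alpha) q$ is a representative member of that family (the paper's exact family member may differ), the snippet below computes it for discrete distributions. The names `kl` and `js_skewed` are hypothetical helpers, not from the paper.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions; assumes q_i > 0 wherever p_i > 0."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))

def js_skewed(p, q, alpha):
    """Skewed Jensen-Shannon divergence with weight alpha:
    JS_alpha(p, q) = alpha * KL(p || m) + (1 - alpha) * KL(q || m),
    where m = alpha * p + (1 - alpha) * q is the mixture distribution."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    m = alpha * p + (1.0 - alpha) * q
    return alpha * kl(p, m) + (1.0 - alpha) * kl(q, m)

# Toy example with an asymmetric weight, mirroring an unequal prior.
p, q = [0.5, 0.5], [0.6, 0.4]
print(js_skewed(p, q, alpha=0.1))  # small value for nearby distributions
```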