On the Structure of Replicable Hypothesis Testers

📅 2025-07-03
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper studies replicable hypothesis testing, in which two independent samples from the same distribution must yield the same decision with high probability. The central problem is to characterize tight sample complexity bounds and to design testers that are accurate, computationally efficient, and satisfy natural desiderata such as symmetry. The authors introduce a canonical form for replicable testers, proving that any replicable test can be converted into one that computes a deterministic test statistic and thresholds it against a uniformly random value in $[0,1]$; for symmetric problems the canonical tester is additionally invariant to relabeling of the domain, resolving an open question posed by Liu and Ye. Leveraging test statistics with known expectation and bounded variance, they develop a low-overhead transformation from non-replicable to replicable algorithms via randomized thresholds. Their framework yields new lower bounds and optimal or state-of-the-art sample complexity for uniformity, identity, closeness, coin, and Gaussian mean testing, with constant-factor optimality for several tasks. All algorithms run in polynomial time.

📝 Abstract
A hypothesis testing algorithm is replicable if, when run on two different samples from the same distribution, it produces the same output with high probability. This notion, defined by Impagliazzo, Lei, Pitassi, and Sorrell [STOC'22], can increase trust in testing procedures and is deeply related to algorithmic stability, generalization, and privacy. We build general tools to prove lower and upper bounds on the sample complexity of replicable testers, unifying and quantitatively improving upon existing results. We identify a set of canonical properties, and prove that any replicable testing algorithm can be modified to satisfy these properties without worsening accuracy or sample complexity. A canonical replicable algorithm computes a deterministic function of its input (i.e., a test statistic) and thresholds against a uniformly random value in $[0,1]$. It is invariant to the order in which the samples are received, and, if the testing problem is ``symmetric,'' then the algorithm is also invariant to the labeling of the domain elements, resolving an open question by Liu and Ye [NeurIPS'24]. We prove new lower bounds for uniformity, identity, and closeness testing by reducing to the case where the replicable algorithm satisfies these canonical properties. We systematize and improve upon a common strategy for replicable algorithm design based on test statistics with known expectation and bounded variance. Our framework allows testers which have been extensively analyzed in the non-replicable setting to be made replicable with minimal overhead. As direct applications of our framework, we obtain constant-factor optimal bounds for coin testing and closeness testing and get replicability for free in a large parameter regime for uniformity testing. We also give state-of-the-art bounds for replicable Gaussian mean testing, and, unlike prior work, our algorithm runs in polynomial time.
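The canonical form described in the abstract (a deterministic test statistic compared against a shared uniformly random threshold) can be illustrated with a minimal sketch. The function below is a toy coin tester under stated assumptions, not the paper's algorithm: the statistic is the empirical mean, the threshold is drawn uniformly from the gap between the null and alternative expectations, and the parameter choices (the eps/3 offsets, the sample size in the usage) are illustrative only.

```python
import numpy as np

def replicable_coin_tester(samples, p0=0.5, eps=0.1, rng=None):
    """Hedged sketch of a randomized-threshold replicable tester.

    Tests whether a coin's bias is p0 versus at least p0 + eps using a
    deterministic test statistic (the empirical mean) compared against a
    single uniformly random threshold.

    samples : 0/1 array drawn i.i.d. from the unknown coin.
    rng     : shared internal randomness; replicability is over runs that
              reuse the same seed on independent samples.
    """
    rng = np.random.default_rng(rng)
    stat = np.mean(samples)  # deterministic test statistic
    # Threshold drawn uniformly from the gap between the two hypotheses'
    # expectations. With enough samples, stat concentrates well inside one
    # side of this interval, so two independent runs land on the same side
    # of the shared random threshold with high probability.
    threshold = rng.uniform(p0 + eps / 3, p0 + 2 * eps / 3)
    return "reject" if stat >= threshold else "accept"

# Usage: two independent samples from the same coin, same internal randomness.
s1 = np.random.binomial(1, 0.5, size=5000)
s2 = np.random.binomial(1, 0.5, size=5000)
assert replicable_coin_tester(s1, rng=7) == replicable_coin_tester(s2, rng=7)  # holds w.h.p.
```

The design choice being illustrated is that all of the tester's randomness sits in the threshold: conditioned on the shared random string, the decision is a deterministic function of the statistic, so disagreement between two runs requires the statistic to fall on opposite sides of the threshold, an event the random placement of the threshold makes unlikely.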
Problem

Research questions and friction points this paper is trying to address.

Develop general tools for proving sample complexity bounds for replicable hypothesis testers
Identify canonical properties and show every replicable tester can be made to satisfy them
Convert non-replicable testers into replicable ones with minimal overhead
Innovation

Methods, ideas, or system contributions that make the work stand out.

General tools for upper and lower bounds on replicable testers
Canonical properties for replicable testing algorithms
Framework for minimal-overhead replicability via randomized thresholds