🤖 AI Summary
In settings where label acquisition is costly, classical two-sample tests, which require fully labeled samples, are impractical.
Method: This work is the first to bring active learning into nonparametric two-sample testing, proposing a unified theoretical framework that jointly optimizes label efficiency and statistical validity. The approach integrates U-statistic construction, bias correction, and an adaptive query strategy to maximize test power under a strict label budget.
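To make these ingredients concrete, here is a minimal Python sketch of an MMD-style U-statistic with a budgeted query step. The Gaussian kernel, the `query_split` helper, and the uniform querying rule are illustrative assumptions, not the paper's algorithm; the paper's adaptive strategy would choose which point to label next based on current estimates rather than at random.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    # Gaussian RBF kernel between two feature vectors.
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_u_statistic(X, Y, bandwidth=1.0):
    # Unbiased U-statistic estimate of the squared MMD between X and Y.
    # Excluding the i == j diagonal terms is what removes the bias of the
    # naive plug-in (V-statistic) estimator.
    n, m = len(X), len(Y)
    kxx = sum(gaussian_kernel(X[i], X[j], bandwidth)
              for i in range(n) for j in range(n) if i != j) / (n * (n - 1))
    kyy = sum(gaussian_kernel(Y[i], Y[j], bandwidth)
              for i in range(m) for j in range(m) if i != j) / (m * (m - 1))
    kxy = sum(gaussian_kernel(X[i], Y[j], bandwidth)
              for i in range(n) for j in range(m)) / (n * m)
    return kxx + kyy - 2.0 * kxy

def query_split(pool, query_label, budget, rng):
    # Spend the label budget on points from the unlabeled pool and return
    # the two labeled subsamples.  Uniform random querying is a placeholder:
    # an adaptive strategy would instead pick each next point where a label
    # is expected to be most informative for the test statistic.
    idx = rng.choice(len(pool), size=budget, replace=False)
    labels = np.array([query_label(i) for i in idx])
    return pool[idx[labels == 0]], pool[idx[labels == 1]]
```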
Contribution/Results: The framework comes with finite-sample control of the Type-I error and asymptotic guarantees on test power. Empirical evaluation on real-world tasks, including controlled medical studies, shows substantial gains over conventional passive two-sample tests while remaining interpretable and practical to deploy.
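One standard way to obtain finite-sample Type-I error control is permutation calibration: under the null the pooled observations are exchangeable, so reassigning labels at random yields an exact null distribution at any sample size. The sketch below (reusing `mmd_u_statistic` from above) illustrates that idea; it is not necessarily the paper's exact calibration, and validity under adaptive querying would require the paper's additional corrections.

```python
def permutation_p_value(X, Y, n_permutations=200, bandwidth=1.0, rng=None):
    # Calibrate the U-statistic by permutation: under the null hypothesis
    # the pooled sample is exchangeable, so randomly reassigning labels
    # simulates draws from the null distribution of the statistic.
    rng = np.random.default_rng() if rng is None else rng
    pooled = np.concatenate([X, Y])
    n = len(X)
    observed = mmd_u_statistic(X, Y, bandwidth)
    count = 1  # count the observed statistic itself, making the test exact
    for _ in range(n_permutations):
        perm = rng.permutation(len(pooled))
        stat = mmd_u_statistic(pooled[perm[:n]], pooled[perm[n:]], bandwidth)
        if stat >= observed:
            count += 1
    return count / (n_permutations + 1)
```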
📝 Abstract
Hypothesis testing is a statistical inference approach used to determine whether data support a specific hypothesis. An important type is the two-sample test, which evaluates whether two sets of data points are drawn from identical distributions. This test is widely used, for example by clinical researchers comparing treatment effectiveness. This tutorial explores two-sample testing in a context where an analyst has many features from two samples, but determining the sample membership (or labels) of these features is costly. In machine learning, a similar scenario is studied in active learning. This tutorial extends active learning concepts to two-sample testing within this label-costly setting while maintaining statistical validity and high testing power. Additionally, the tutorial discusses practical applications of these label-efficient two-sample tests.
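As a toy illustration of this label-costly setting, the snippet below draws a pooled set of feature vectors from two Gaussians, hides sample membership behind a stand-in oracle, spends a budget of 60 label queries, and reports the statistic and permutation p-value using the sketches above. The data, budget, and helper names are all illustrative.

```python
# Two populations that differ by a mean shift; labels hidden behind an oracle.
rng = np.random.default_rng(0)
pool = np.concatenate([rng.normal(0.0, 1.0, size=(100, 2)),   # population 0
                       rng.normal(0.5, 1.0, size=(100, 2))])  # population 1
true_labels = np.array([0] * 100 + [1] * 100)

# Spend a budget of 60 costly label queries, then run the test.
X, Y = query_split(pool, lambda i: true_labels[i], budget=60, rng=rng)
print("MMD^2 U-statistic:", mmd_u_statistic(X, Y))
print("permutation p-value:", permutation_p_value(X, Y, rng=rng))
```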