🤖 AI Summary
This paper addresses the rank verification problem, that is, determining whether the observed top-ranked unit corresponds to the true maximum mean, under heteroscedastic Gaussian distributions. It introduces the first systematic selective inference framework for this task, avoiding the power loss caused by conventional multiple testing corrections. Methodologically, it unifies Top-K set validation and partial order identification via conditional hypothesis testing and exact p-value construction. Compared to standard approaches, it achieves substantially higher statistical power while preserving rigorous type-I error control, as empirically validated on real-world NHANES survey data. Key contributions are: (1) the first theoretical framework and algorithm for rank verification under heteroscedasticity; (2) integration of selective inference principles to relax the homoscedasticity assumption; and (3) an open-source, reproducible, and interpretable software package implementing the method.
📝 Abstract
Statistical experiments often seek to identify random variables with the largest population means. This inferential task, known as rank verification, has been well studied for Gaussian data with equal variances. This work provides the first treatment of the unequal variances case, utilizing ideas from the selective inference literature. We design a hypothesis test that verifies the rank of the largest observed value without losing power due to multiple testing corrections. This test is subsequently extended to two procedures: identifying some number of correctly ordered Gaussian means, and validating the top-K set. The testing procedures are validated on NHANES survey data.
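To make the setting concrete, the sketch below shows the kind of conventional, corrected baseline the abstract contrasts against. It is not the paper's selective test. It assumes one Gaussian observation per unit with a known standard deviation, runs one-sided pairwise z-tests of the observed winner against every other unit, and applies a Bonferroni correction over all ordered pairs. The function name `bonferroni_rank_check` and its arguments are hypothetical, chosen for illustration only.

```python
import math
from statistics import NormalDist


def bonferroni_rank_check(x, sigma, alpha=0.05):
    """Conservative baseline (NOT the paper's method).

    x[i] is a single Gaussian observation with known standard deviation
    sigma[i]. For the observed winner w, each rival j gets a one-sided
    z-test of H0: mu_w <= mu_j. Because w is chosen from the data, alpha
    is split over all n * (n - 1) ordered pairs (Bonferroni).

    Returns (winner index, largest pairwise p-value, verified flag).
    """
    n = len(x)
    if n < 2 or len(sigma) != n:
        raise ValueError("need at least two units and matching sigma")

    winner = max(range(n), key=lambda i: x[i])
    threshold = alpha / (n * (n - 1))
    std_normal = NormalDist()

    worst_p = 0.0
    for j in range(n):
        if j == winner:
            continue
        # Unequal variances: the difference x[winner] - x[j] has
        # standard deviation sqrt(sigma_w^2 + sigma_j^2).
        z = (x[winner] - x[j]) / math.hypot(sigma[winner], sigma[j])
        p = 1.0 - std_normal.cdf(z)
        worst_p = max(worst_p, p)

    # The winner is declared best only if every pairwise null is rejected.
    return winner, worst_p, worst_p <= threshold


# Hypothetical data: unit 0 wins, but the noisy unit 2 cannot be ruled out.
winner, worst_p, verified = bonferroni_rank_check(
    x=[5.0, 1.0, 0.5], sigma=[1.0, 0.5, 2.0]
)
print(winner, round(worst_p, 4), verified)
```

In this example the high-variance rival dominates the largest p-value, and the Bonferroni threshold of alpha / 6 is too strict for the winner to be verified. That conservativeness is the power loss the abstract says its test avoids.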