🤖 AI Summary
This study addresses the failure of conventional two-sample homogeneity tests under nonignorable nonresponse. The authors propose a semiparametric approach that incorporates callback data, which record repeated contact attempts in a survey, into two-sample testing. The two population distributions are linked through a density ratio model, and the response mechanism is modeled by a flexible semiparametric callback model. An empirical likelihood ratio test statistic is constructed and shown to have a Wilks-type chi-square limit under the null hypothesis; computation is carried out with an expectation-maximization-type algorithm. Simulation studies show that the procedure controls type I error well and achieves substantially higher power than existing methods that ignore nonignorable missingness. An application to real survey income data illustrates the method's practical value.
📝 Abstract
Testing the homogeneity of two distributions is fundamental in statistics, but classical procedures may fail under nonignorable nonresponse. In many surveys, callback data record repeated contact attempts and provide auxiliary information about the response mechanism. We develop a semiparametric framework for two-sample homogeneity testing that explicitly incorporates such information. The response mechanism is modeled by a flexible semiparametric callback model, while the two population distributions are linked through a density ratio model. Within this unified framework, we propose an empirical likelihood ratio test for distributional homogeneity and show that, under the null hypothesis, it has a Wilks-type chi-square limit. To facilitate computation, we develop an efficient expectation-maximization-type algorithm. Simulation results show that the proposed method controls type I error well and achieves substantially higher power than existing methods that ignore nonignorable missingness. An application to real survey income data illustrates its practical value.
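For readers unfamiliar with the density ratio model, the following is a sketch of its standard textbook form and of how homogeneity becomes a parametric hypothesis within it. The symbols $F_0$, $F_1$, $\alpha$, $\beta$, and $q$ are notation assumed here for illustration; the abstract does not state the basis function, the exact parameterization, or the degrees of freedom of the limiting distribution used by the authors.

```latex
% Standard density ratio model (assumed notation, not taken from the paper):
% F_0 and F_1 are the two population distributions, q(x) is a known
% vector-valued basis function, and (alpha, beta) are unknown parameters.
\[
  \mathrm{d}F_1(x) \;=\; \exp\{\alpha + \beta^{\top} q(x)\}\,\mathrm{d}F_0(x),
  \qquad
  \int \exp\{\alpha + \beta^{\top} q(x)\}\,\mathrm{d}F_0(x) = 1 .
\]
% The baseline F_0 is left unspecified, which is what makes the model
% semiparametric. Setting beta = 0 forces alpha = 0 through the
% normalizing constraint, so the two distributions coincide.
\[
  H_0 : F_0 = F_1
  \quad\text{corresponds to}\quad
  H_0 : \beta = 0 .
\]
```

Under this formulation, a test of distributional homogeneity reduces to a test on the finite-dimensional parameter $\beta$, which is the kind of hypothesis an empirical likelihood ratio statistic can address.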