🤖 AI Summary
This study addresses variance inflation in regression models fitted to complex survey data, which arises when sampling weights carry non-essential variation. Working in a two-phase sampling framework, the authors evaluate optimal stabilized weights and propose combining the stabilized weight estimator with generalized raking, a class of efficient design-based estimators. The combination reduces unnecessary weight variation while exploiting auxiliary variables, and it can be implemented with standard statistical software that handles two-phase samples and generalized raking. Simulation studies show that the proposed estimator improves precision under realistic two-phase designs, although the efficiency gains may be limited when the design is highly informative. The methods are applied to a large multinational two-phase study of Kaposi sarcoma among people living with HIV.
📝 Abstract
In regression models fitted to data from complex survey designs, sampling weights often incorporate non-essential variation, inflating variance estimates. Stabilized weights mitigate this issue by adjusting sampling weights to account for variation explained by covariates. In the context of two-phase sampling, we evaluate the performance of optimal stabilized weights and propose combining the stabilized weight estimator with generalized raking, a class of efficient design-based estimators. This combination improves efficiency by reducing unnecessary weight variation and leveraging information from auxiliary variables. We show this combination can be implemented using the standard statistical package that handles two-phase samples and generalized raking. Simulation studies demonstrate that the proposed estimator enhances precision under realistic two-phase designs, though efficiency gains may be limited in highly informative designs. The developed methods were applied to a large multinational two-phase study of Kaposi sarcoma among people living with HIV.
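To make the raking step concrete, the following is a minimal sketch of calibrating phase-two weights to a phase-one auxiliary total. It is not the authors' implementation: the abstract does not say which calibration distance or software is used, so the exponential (raking) adjustment, the single auxiliary variable, the function name `rake_weights`, and the toy data are all assumptions made for illustration.

```python
import math


def rake_weights(weights, aux, target_total, tol=1e-10, max_iter=50):
    """Rescale weights so the weighted total of `aux` matches `target_total`.

    Assumed form: adjusted weight = w_i * exp(lam * x_i), with the scalar
    `lam` found by Newton's method on the calibration equation
    sum(w_i * exp(lam * x_i) * x_i) = target_total.
    """
    lam = 0.0
    for _ in range(max_iter):
        adjusted = [w * math.exp(lam * x) for w, x in zip(weights, aux)]
        gap = sum(a * x for a, x in zip(adjusted, aux)) - target_total
        if abs(gap) < tol:
            break
        # Derivative of the weighted total with respect to lam.
        slope = sum(a * x * x for a, x in zip(adjusted, aux))
        lam -= gap / slope
    return [w * math.exp(lam * x) for w, x in zip(weights, aux)]


# Hypothetical toy data: the auxiliary variable is observed for the whole
# phase-one sample, but only three units are selected into phase two.
phase1_aux = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
phase2_index = [0, 2, 5]
design_weights = [2.0, 2.0, 2.0]  # inverse phase-two sampling probabilities

phase2_aux = [phase1_aux[i] for i in phase2_index]
target = sum(phase1_aux)  # phase-one total that the raked weights must reproduce

raked = rake_weights(design_weights, phase2_aux, target)
raked_total = sum(w * x for w, x in zip(raked, phase2_aux))
print(raked, raked_total)
```

In this sketch the raked weights stay positive by construction and reproduce the phase-one total of the auxiliary variable, which is the property generalized raking uses to borrow information from phase one. The stabilization step described in the abstract is not shown here, because its exact form is not given.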