🤖 AI Summary
This study addresses the problem of comparing small area estimation (SAE) models for subnational public health monitoring, where data are sparse at the desired spatial resolution and no ground truth is available. The authors propose a cross-validation framework that accommodates complex survey designs and decomposes the cross-validated squared error into an identifiable bias component and unidentifiable components that can be bounded. This enables model-agnostic comparisons between area-level and unit-level models. Theoretical results and simulations show that conventional approaches, such as leave-one-area-out cross-validation, can yield misleading model rankings, whereas the proposed approach gives more robust and interpretable comparisons with uncertainty quantification. The framework is demonstrated by comparing SAE models for mapping the female literacy rate in Zambia using Demographic and Health Surveys data.
📝 Abstract
Subnational monitoring of public health often relies on household surveys where data are sparse at the desired spatial resolution. Small area estimation (SAE) methods address this challenge by borrowing strength across areas and incorporating auxiliary information. However, comparing these estimators remains difficult in the absence of ground truth. We propose a cross-validation framework for evaluating small area estimators that accommodates complex survey designs. Our approach enables model-agnostic comparisons between area-level and unit-level models. Central to our framework is a decomposition of the cross-validated squared error in the context of SAE, which reveals both identifiable bias and unidentifiable components that can be bounded. Our theoretical results and simulation studies show that conventional approaches, such as leave-one-area-out cross-validation, can yield misleading model rankings, whereas the proposed approach offers more robust and interpretable model comparison with uncertainty quantification. We demonstrate the procedure for comparing SAE models for mapping the female literacy rate using Demographic and Health Surveys from Zambia.
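To make the idea of splitting a cross-validated squared error more concrete, here is a minimal sketch in Python. It is not the paper's decomposition: it only illustrates the generic algebraic identity that holds when a prediction is scored against a noisy held-out direct estimate rather than the unknown truth. The function name `cv_error_components`, the simulated rates, the standard errors, and the assumption that the prediction is independent of the held-out sampling error are all hypothetical choices made for illustration.

```python
import random


def cv_error_components(n_areas=20000, bias=0.02, seed=0):
    """Simulate held-out direct estimates and split the CV squared error.

    All numeric settings are hypothetical; they are not taken from the paper.
    """
    rng = random.Random(seed)
    cv_err = true_mse = noise = cross = 0.0
    for _ in range(n_areas):
        theta = rng.uniform(0.3, 0.9)   # hypothetical true area-level rate
        sd = rng.uniform(0.02, 0.08)    # hypothetical design-based standard error
        e = rng.gauss(0.0, sd)          # sampling error of the held-out direct estimate
        y = theta + e                   # held-out direct estimate (what we observe)
        # Model prediction: biased and noisy, drawn independently of e (an assumption).
        pred = theta + bias + rng.gauss(0.0, 0.03)
        d = pred - theta                # prediction error against the unknown truth
        cv_err += (pred - y) ** 2       # what cross-validation can compute
        true_mse += d ** 2              # what we would like to know
        noise += e ** 2                 # sampling noise of the held-out estimate
        cross += d * e                  # interaction term, needs the unknown truth
    n = float(n_areas)
    return {
        "cv_error": cv_err / n,
        "true_mse": true_mse / n,
        "noise": noise / n,
        "cross": cross / n,
    }


if __name__ == "__main__":
    r = cv_error_components()
    # Identity: cv_error = true_mse + noise - 2 * cross
    print(r)
```

In this sketch the cross term averages out to nearly zero only because the prediction was simulated independently of the held-out sampling error. When the two are correlated, the cross term need not vanish, and it cannot be computed from the data alone because the true area-level values are unknown.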