🤖 AI Summary
This study addresses causal inference when the outcome variable is latent and can only be measured indirectly through multiple imperfect indicators. Conventional methods are vulnerable to two comparability problems: different measurement systems across studies can lead estimators to target different quantities, and indicators within a study can relate to the latent outcome in different, possibly nonlinear ways. To overcome these limitations, the authors propose a design-based nonparametric framework that identifies and estimates the average treatment effect on the latent outcome in randomized experiments. The key innovation is an identifiable nonparametric bridge function that accommodates differences in measurement systems across studies and nonlinear relationships between indicators and the latent outcome, without imposing strong parametric assumptions on the measurement model. Estimation uses a debiasing procedure that permits valid inference even when the bridge functions are weakly identified. In simulations, standard benchmarks such as principal components analysis and inverse covariance weighting can generate spurious cross-study differences, whereas the proposed method recovers comparable treatment effects on the latent outcome.
📝 Abstract
How should researchers conduct causal inference when the outcome of interest is latent and measured imperfectly by multiple indicators? We develop a general nonparametric framework for identifying and estimating average treatment effects on latent outcomes in randomized experiments. We show that latent-outcome estimation faces two distinct noncomparability challenges. First, across studies, different measurement systems may cause estimators to target different empirical quantities even when the underlying latent treatment effect is the same. Second, within a study, different indicators may have different and possibly nonlinear relationships with the same latent outcome, making them not directly comparable. To address these challenges, we propose a design-based approach built around nonparametric bridge functions. We show that these bridge functions can be characterized and identified. Estimation relies on a debiasing procedure that permits valid inference even when the bridge functions are weakly identified. Simulations demonstrate that standard methods, such as principal components analysis and inverse covariance weighting, can generate spurious cross-study differences, whereas our approach recovers comparable latent treatment effects. Overall, the framework provides both a general strategy for causal inference with latent outcomes and practical guidance for designing measurements that support identification, comparability, and efficient estimation.
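The cross-study noncomparability problem can be illustrated with a small simulation. The sketch below is not the authors' simulation design or estimator. It assumes a linear measurement model with three indicators, a latent outcome with unit variance in the control group, and one common variant of the inverse-covariance-weighted (ICW) index. All names and parameter values (`simulate_study`, `icw_index`, `loadings`, `noise_sd`, `tau`) are hypothetical. Two studies share the same latent treatment effect but differ in how noisy their indicators are, and the difference in means of the ICW index is computed in each.

```python
import numpy as np


def simulate_study(loadings, noise_sd, n, tau, rng):
    """Simulate one randomized experiment under an assumed linear measurement model."""
    d = rng.integers(0, 2, size=n)                 # randomized binary treatment
    latent = tau * d + rng.normal(size=n)          # latent outcome, true effect tau
    noise = rng.normal(size=(n, len(loadings))) * noise_sd
    y = latent[:, None] * np.asarray(loadings) + noise
    return d, y


def icw_index(y, d):
    """One common variant of the ICW index (an assumption of this sketch).

    Indicators are standardized by the control-group mean and standard
    deviation, then averaged with weights proportional to the row sums of
    the inverse covariance matrix of the standardized indicators.
    """
    control = d == 0
    z = (y - y[control].mean(axis=0)) / y[control].std(axis=0, ddof=1)
    sigma_inv = np.linalg.inv(np.cov(z, rowvar=False))
    w = sigma_inv.sum(axis=1)
    return z @ w / w.sum()


def diff_in_means(index, d):
    """Difference in mean index value between treated and control units."""
    return index[d == 1].mean() - index[d == 0].mean()


rng = np.random.default_rng(0)
tau = 0.5          # identical latent treatment effect in both studies
n = 20_000
loadings = [1.0, 1.0, 1.0]

# Study A measures the latent outcome precisely, study B noisily.
d_a, y_a = simulate_study(loadings, noise_sd=0.3, n=n, tau=tau, rng=rng)
d_b, y_b = simulate_study(loadings, noise_sd=2.0, n=n, tau=tau, rng=rng)

est_a = diff_in_means(icw_index(y_a, d_a), d_a)
est_b = diff_in_means(icw_index(y_b, d_b), d_b)

print(f"Study A ICW estimate: {est_a:.3f}")
print(f"Study B ICW estimate: {est_b:.3f}")
```

Under these assumptions, each standardized indicator carries the latent effect scaled by `loading / sqrt(loading**2 + noise_sd**2)`, so the noisier study yields a smaller index-level estimate even though `tau` is the same in both. The gap reflects the measurement system rather than any difference in the latent effect, which is the kind of spurious cross-study difference the abstract describes.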