Program Evaluation with Remotely Sensed Outcomes

📅 2024-11-17
🏛️ arXiv.org
📈 Citations: 1
Influential: 0
🤖 AI Summary
When remotely sensed variables (e.g., satellite imagery, mobile phone activity) serve as proxies for economic outcomes in causal inference, the common practice of substituting a single prediction for the outcome is biased whenever the proxy is “post-outcome,” i.e., whenever variation in the economic outcome causes variation in the remotely sensed variable. Method: We propose a nonparametric identification framework based on the assumption that the conditional distribution of the remotely sensed variable, given the outcome and treatment, is stable across the observational and experimental samples. Valid inference requires no rate conditions on the remote-sensing predictions, which permits high-dimensional models, including deep learning, and the efficient representation of the remotely sensed variable uses three predictions rather than one. Contribution/Results: We show that the common practice is biased for post-outcome proxies and derive an identifying formula that avoids this bias. We apply the method in a reanalysis of an anti-poverty program in India using satellite images.

📝 Abstract
Economists often estimate treatment effects in experiments using remotely sensed variables (RSVs), e.g. satellite images or mobile phone activity, in place of directly measured economic outcomes. A common practice is to use an observational sample to train a predictor of the economic outcome from the RSV, and then to use its predictions as the outcomes in the experiment. We show that this method is biased whenever the RSV is post-outcome, i.e. if variation in the economic outcome causes variation in the RSV. In program evaluation, changes in poverty or environmental quality cause changes in satellite images, but not vice versa. As our main result, we nonparametrically identify the treatment effect by formalizing the intuition that underlies common practice: the conditional distribution of the RSV given the outcome and treatment is stable across the samples. Based on our identifying formula, we find that the efficient representation of RSVs for causal inference requires three predictions rather than one. Valid inference does not require any rate conditions on RSV predictions, justifying the use of complex deep learning algorithms with unknown statistical properties. We re-analyze the effect of an anti-poverty program in India using satellite images.
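The bias described in the abstract can be illustrated with a small simulation. The sketch below is a toy example, not the paper's estimator: the data-generating process, the scalar RSV, the linear predictor, and all names (`simulate_rsv`, `naive_effect`, `rescaled_effect`) are assumptions made for illustration. The outcome causes the RSV, the RSV's distribution given the outcome is the same in both samples, and treatment affects the RSV only through the outcome. Under those assumptions, plugging predicted outcomes into the experiment shrinks the estimated effect, while rescaling the treatment-control difference in the RSV by the observational slope of the RSV on the outcome recovers it.

```python
import numpy as np

rng = np.random.default_rng(0)


def simulate_rsv(y, rng):
    """Toy post-outcome RSV (assumed): the outcome shifts the RSV, plus noise."""
    return 1.0 * y + rng.normal(0.0, 1.0, size=y.shape)


# Observational sample: outcome and RSV are both observed.
n_obs = 200_000
y_obs = rng.binomial(1, 0.5, size=n_obs).astype(float)
r_obs = simulate_rsv(y_obs, rng)

# Experimental sample: treatment and RSV are observed. The outcome is
# simulated only so that the true effect is known by construction.
n_exp = 200_000
d_exp = rng.binomial(1, 0.5, size=n_exp)
y_exp = rng.binomial(1, 0.3 + 0.2 * d_exp).astype(float)
r_exp = simulate_rsv(y_exp, rng)
true_effect = 0.2

# Common practice: fit a predictor of Y from R on the observational sample
# (a linear fit here, as an assumption), then treat predictions as outcomes.
slope, intercept = np.polyfit(r_obs, y_obs, deg=1)
y_hat = intercept + slope * r_exp
naive_effect = y_hat[d_exp == 1].mean() - y_hat[d_exp == 0].mean()

# Toy alternative that uses stability of R given Y across samples:
# E[R | D = d] = a + b * E[Y | D = d], so the effect on Y is the
# treatment-control difference in R divided by b.
b, a = np.polyfit(y_obs, r_obs, deg=1)
diff_r = r_exp[d_exp == 1].mean() - r_exp[d_exp == 0].mean()
rescaled_effect = diff_r / b

print(f"true effect:     {true_effect:.3f}")
print(f"naive estimate:  {naive_effect:.3f}")
print(f"rescaled (toy):  {rescaled_effect:.3f}")
```

In this design the naive estimate is attenuated because the fitted slope of the outcome on the RSV equals Var(Y) / (Var(Y) + noise variance), which is below one whenever the RSV is a noisy consequence of the outcome. The rescaling step relies on the toy model's linearity and on treatment having no direct effect on the RSV; the paper's nonparametric formula is more general and is not reproduced here.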
Problem

Research questions and friction points this paper is trying to address.

Bias in treatment effects using remotely sensed variables
Nonparametric identification of treatment effects with RSVs
Efficient RSV representation requires three predictions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Nonparametric identification of treatment effects
Efficient RSV representation with three predictions
Deep learning without statistical rate conditions