Program Evaluation with Remotely Sensed Outcomes

📅 2024-11-17

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

When remotely sensed variables (RSVs), e.g. satellite imagery or mobile phone activity, stand in for economic outcomes in causal inference, the common practice of substituting a single trained prediction for the outcome is biased whenever the RSV is post-outcome, i.e. when variation in the economic outcome causes variation in the RSV. Method: The paper nonparametrically identifies the treatment effect under the assumption that the conditional distribution of the RSV given the outcome and treatment is stable across the observational and experimental samples. Valid inference requires no rate conditions on the RSV predictions, which justifies the use of complex models such as deep learning. Contribution/Results: The paper shows that the common practice is biased for post-outcome RSVs and derives an efficient representation of RSVs that uses three predictions rather than one. The approach is illustrated with a re-analysis of an anti-poverty program in India using satellite images.

📝 Abstract

Economists often estimate treatment effects in experiments using remotely sensed variables (RSVs), e.g. satellite images or mobile phone activity, in place of directly measured economic outcomes. A common practice is to use an observational sample to train a predictor of the economic outcome from the RSV, and then to use its predictions as the outcomes in the experiment. We show that this method is biased whenever the RSV is post-outcome, i.e. if variation in the economic outcome causes variation in the RSV. In program evaluation, changes in poverty or environmental quality cause changes in satellite images, but not vice versa. As our main result, we nonparametrically identify the treatment effect by formalizing the intuition that underlies common practice: the conditional distribution of the RSV given the outcome and treatment is stable across the samples. Based on our identifying formula, we find that the efficient representation of RSVs for causal inference requires three predictions rather than one. Valid inference does not require any rate conditions on RSV predictions, justifying the use of complex deep learning algorithms with unknown statistical properties. We re-analyze the effect of an anti-poverty program in India using satellite images.
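The bias of the common practice can be illustrated with a small simulation. The sketch below is not the paper's estimator or data: the data-generating process, the function name `simulate`, the Gaussian noise, and the linear predictor are all assumptions chosen for illustration. It builds a post-outcome RSV (the outcome causes the RSV), keeps the distribution of the RSV given the outcome identical across the observational and experimental samples, and compares the plug-in estimate with the difference in true outcomes.

```python
import numpy as np


def simulate(n=200_000, noise_sd=1.0, seed=0):
    """Compare the plug-in effect estimate with the effect on true outcomes.

    Hypothetical setup (not from the paper): binary outcome Y, scalar RSV
    R = Y + Gaussian noise, so R is post-outcome by construction.
    """
    rng = np.random.default_rng(seed)

    # Observational sample: outcome and RSV are both observed.
    y_obs = rng.binomial(1, 0.5, size=n)
    r_obs = y_obs + rng.normal(0.0, noise_sd, size=n)

    # Train a predictor of Y from R (a linear fit, for simplicity).
    slope, intercept = np.polyfit(r_obs, y_obs, deg=1)

    # Experimental sample: randomized treatment D raises P(Y = 1)
    # from 0.3 to 0.5, so the assumed true effect is 0.2.
    d = rng.binomial(1, 0.5, size=n)
    y_exp = rng.binomial(1, 0.3 + 0.2 * d)
    # Same R-given-Y mechanism as in the observational sample.
    r_exp = y_exp + rng.normal(0.0, noise_sd, size=n)

    # Common practice: use predictions in place of the outcome.
    y_hat = intercept + slope * r_exp

    true_effect = y_exp[d == 1].mean() - y_exp[d == 0].mean()
    plug_in_effect = y_hat[d == 1].mean() - y_hat[d == 0].mean()
    return true_effect, plug_in_effect


if __name__ == "__main__":
    true_effect, plug_in_effect = simulate()
    print(f"difference in true outcomes:  {true_effect:.3f}")
    print(f"difference in predicted ones: {plug_in_effect:.3f}")
```

In this setup the plug-in estimate is attenuated toward zero, because the fitted slope of the outcome on a noisy RSV is well below one. The linear predictor is only a convenience here; the abstract's claim covers trained predictors in general.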

Problem

Research questions and friction points this paper is trying to address.

Bias in treatment effects using remotely sensed variables

Nonparametric identification of treatment effects with RSVs

Efficient RSV representation requires three predictions

Innovation

Methods, ideas, or system contributions that make the work stand out.

Nonparametric identification of treatment effects

Efficient RSV representation with three predictions

Deep learning without statistical rate conditions
