🤖 AI Summary
In Bayesian causal inference, sample-level and population-level estimands are frequently conflated, despite fundamental differences in their identification assumptions, modeling strategies, posterior marginalization procedures, and interpretation. In particular, sample-level estimation requires joint modeling across counterfactual worlds, whereas population-level estimation permits direct marginalization from the joint posterior over observed data without additional cross-world assumptions. Method: We propose a first-principles framework for rigorously specifying the target posterior marginal distribution, systematically exposing sources of common implementation bias. We validate the framework using Stan on synthetic data. Contribution/Results: The paper provides a complete computational workflow and implementation guide, illustrating via concrete examples how to avoid conflation, improve inferential accuracy and interpretability, and ensure methodological rigor. It establishes both a theoretical foundation and an operational paradigm for principled Bayesian causal inference.
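To make the marginalization point concrete, the following sketch writes down the joint posterior and the two marginals at issue. The notation here is generic and supplied for illustration, not taken from the paper: $y_{\mathrm{obs}}$ denotes observed outcomes, $y_{\mathrm{mis}}$ the missing counterfactuals, and $\theta$ the population parameters.

```latex
% Full joint posterior over missing counterfactuals and parameters:
p(y_{\mathrm{mis}}, \theta \mid y_{\mathrm{obs}})
  \propto p(y_{\mathrm{obs}}, y_{\mathrm{mis}} \mid \theta)\, p(\theta).

% Population-level estimands are functions of theta; they use the marginal
p(\theta \mid y_{\mathrm{obs}})
  = \int p(y_{\mathrm{mis}}, \theta \mid y_{\mathrm{obs}})\, \mathrm{d}y_{\mathrm{mis}}.

% Sample-level estimands are functions of y_mis; they use the marginal
p(y_{\mathrm{mis}} \mid y_{\mathrm{obs}})
  = \int p(y_{\mathrm{mis}} \mid \theta, y_{\mathrm{obs}})\,
         p(\theta \mid y_{\mathrm{obs}})\, \mathrm{d}\theta.
```

The first factor in the last integral, $p(y_{\mathrm{mis}} \mid \theta, y_{\mathrm{obs}})$, is where a cross-world model linking counterfactuals must enter, which is the distinction the summary draws.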
📝 Abstract
Bayesian inference for causal estimands has been growing in popularity; however, many misconceptions and implementation errors arise from conflating sample-level and population-level estimands. We have anecdotally witnessed these at conference talks, during peer-review service, and even in published and arXiv-ed papers. Our goal here is to elucidate the crucial differences between sample-level and population-level quantities with respect to identification, modeling, Bayesian computation, and interpretation. For example, common sample-level estimands require cross-world assumptions for identification, whereas common population-level estimands do not. Similarly, the former require explicit imputation of counterfactuals from their joint posterior, whereas the latter typically require only a posterior distribution over parameters. We begin by defining examples of both types of estimands, then discuss the full joint posterior over all unknowns (both missing counterfactuals and population distribution parameters). We then outline how inference for different estimands is derived from different marginals of this joint posterior. Because the differences are conceptually subtle but can be practically substantial, we provide an illustration using synthetic data in Stan. We also provide a detailed appendix with derivations, computational tips, and a discussion of common implementation errors. The overarching message is to always engage in first-principles thinking about which marginal of the joint posterior is of interest in a particular application, then follow the strict logic of Bayes' theorem and probability to avoid common implementation errors.
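The contrast between "explicit imputation of counterfactuals" and "a posterior distribution over parameters" can be sketched in a few lines. The following is a minimal Python illustration, not the paper's code: it assumes a simple normal-outcome model with a constant treatment effect, and it fakes the posterior draws that would in practice come from Stan. The population-level estimand (PATE) is a function of the parameter draws alone; the sample-level estimand (SATE) requires imputing each unit's missing counterfactual per draw, under an explicitly stated cross-world assumption (here: independent counterfactual errors).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic observed data: binary treatment z, normal outcomes y.
n = 200
z = rng.integers(0, 2, size=n)                      # treatment assignment
tau_true, sigma_true = 2.0, 1.0
y = 1.0 + tau_true * z + rng.normal(0, sigma_true, size=n)

# Stand-in for MCMC output: S posterior draws of (mu, tau, sigma).
# (In practice these come from a fitted Stan model; simulated here.)
S = 4000
mu_d = rng.normal(y[z == 0].mean(), 0.10, size=S)
tau_d = rng.normal(y[z == 1].mean() - y[z == 0].mean(), 0.15, size=S)
sigma_d = np.abs(rng.normal(sigma_true, 0.05, size=S))

# Population-level estimand: PATE = E[Y(1) - Y(0)] = tau.
# A marginal of the parameter posterior only -- no imputation needed.
pate_draws = tau_d

# Sample-level estimand: SATE = (1/n) * sum_i [Y_i(1) - Y_i(0)].
# Each unit's *missing* counterfactual is imputed per posterior draw,
# under a cross-world assumption (independent counterfactual errors).
sate_draws = np.empty(S)
for s in range(S):
    y_mis = mu_d[s] + tau_d[s] * (1 - z) + rng.normal(0, sigma_d[s], size=n)
    y1 = np.where(z == 1, y, y_mis)                 # observed or imputed Y(1)
    y0 = np.where(z == 0, y, y_mis)                 # observed or imputed Y(0)
    sate_draws[s] = (y1 - y0).mean()

print(pate_draws.std(), sate_draws.std())
```

In this setup the SATE posterior comes out narrower than the PATE posterior, since the observed half of each within-unit contrast is known rather than simulated; conflating the two marginals would therefore misstate the uncertainty even when the point estimates agree.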