Modeling Latent Selection with Structural Causal Models

📅 2024-01-12
🏛️ arXiv.org
📈 Citations: 4
Influential: 0
🤖 AI Summary
Problem: Existing structural causal model (SCM) theory handles latent confounding and causal cycles, but it lacks a formal account of “latent selection”, the selection bias that arises when an unobserved mechanism determines which units enter the data. Method: We define a conditioning operation on SCMs that turns a model with an explicit latent selection mechanism into one without it, and we show that the operation preserves simplicity, acyclicity, and linearity and commutes with marginalization. Contribution/Results: The conditioned SCM partially encodes the causal semantics of the selected subpopulation, so that, combined with marginalization and intervention, it supports causal reasoning in models where latent details have been abstracted away. We demonstrate by example how classical results of causal inference can be generalized to include selection bias, and how the operation helps model real-world problems such as sample self-selection and missing-data mechanisms.

📝 Abstract
Selection bias is ubiquitous in real-world data, and can lead to misleading results if not dealt with properly. We introduce a conditioning operation on Structural Causal Models (SCMs) to model latent selection from a causal perspective. We show that the conditioning operation transforms an SCM with an explicit latent selection mechanism into an SCM without such a selection mechanism, which partially encodes the causal semantics of the selected subpopulation according to the original SCM. Furthermore, we show that this conditioning operation preserves the simplicity, acyclicity, and linearity of SCMs, and commutes with marginalization. Thanks to these properties, combined with marginalization and intervention, the conditioning operation offers a valuable tool for conducting causal reasoning tasks within causal models where latent details have been abstracted away. We demonstrate by example how classical results of causal inference can be generalized to include selection bias and how the conditioning operation helps with the modeling of real-world problems.
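
As a minimal illustration of why latent selection matters, the sketch below simulates a toy model that is not taken from the paper: two independent Gaussian variables `x` and `y`, and a hypothetical selection rule that keeps a unit only when `x + y > 1`. The variable names, the threshold, the seed, and the sample size are all assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equally long sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=50_000, threshold=1.0, seed=0):
    """Draw independent x and y, then keep only units passing the selection rule."""
    rng = random.Random(seed)
    population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    # Hypothetical latent selection mechanism: a unit is kept iff x + y > threshold.
    selected = [(x, y) for x, y in population if x + y > threshold]
    return population, selected


population, selected = simulate()
corr_all = pearson(*zip(*population))
corr_selected = pearson(*zip(*selected))
print(f"correlation in full population:    {corr_all:+.3f}")
print(f"correlation in selected subsample: {corr_selected:+.3f}")
```

Under these assumed settings the full-population correlation stays near zero, while the selected subsample shows a clearly negative correlation even though neither variable causes the other. An analyst who only ever sees the selected units cannot tell from the data alone that the dependence comes from selection.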

Problem

Research questions and friction points this paper is trying to address.

Develop a theoretical foundation for modeling latent selection in Structural Causal Models
Extend graphical representations to encode latent selection beyond common causes
Establish when standard causal tools remain valid under selection bias

Innovation

Methods, ideas, or system contributions that make the work stand out.

Conditioning operation partially encodes the causal semantics of the selected subpopulation
Bidirected edges encode latent selection graphically
Abstraction streamlines analysis under selection bias
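
The points above can be illustrated with a toy contrast between observing and intervening inside a selected subpopulation. The sketch below is an illustration under stated assumptions, not the paper's construction: `x` and `y` are independent, the hypothetical selection rule keeps units with `x + y > 1`, selection is assumed to happen before any intervention, and `do_x` is a made-up helper that overwrites `x` in every selected unit.

```python
import random


def selected_units(n=50_000, threshold=1.0, seed=1):
    """Sample independent x and y, keep units with x + y > threshold (assumed rule)."""
    rng = random.Random(seed)
    units = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    return [(x, y) for x, y in units if x + y > threshold]


def mean(values):
    return sum(values) / len(values)


def do_x(units, value):
    """Hypothetical intervention after selection: force x to `value` in every unit.

    In this toy model y has no structural dependence on x, so y keeps its value.
    """
    return [(value, y) for _, y in units]


units = selected_units()

# Observation: compare mean y between selected units with high and low x.
obs_high = mean([y for x, y in units if x > 1.0])
obs_low = mean([y for x, y in units if x <= 1.0])

# Intervention: compare mean y after setting x high versus low for all selected units.
int_high = mean([y for _, y in do_x(units, 2.0)])
int_low = mean([y for _, y in do_x(units, -2.0)])

print(f"observational difference in mean y:  {obs_high - obs_low:+.3f}")
print(f"interventional difference in mean y: {int_high - int_low:+.3f}")
```

In this sketch the observational difference is negative while the interventional difference is exactly zero. The selected subpopulation thus carries a dependence between `x` and `y` that is not a causal effect of one on the other, which is the kind of non-causal dependence a bidirected edge is used to represent.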
Leihao Chen
Korteweg-de Vries Institute for Mathematics, University of Amsterdam, Amsterdam, the Netherlands
O. Zoeter
Booking.com, The Netherlands
Joris M. Mooij
Professor in Mathematical Statistics, Korteweg-de Vries Institute, University of Amsterdam (NL)
Causal inference, graphical modeling, machine learning