🤖 AI Summary
This work addresses the problem of estimating causal effects in Gaussian linear structural causal models (GL-SCMs) under latent confounding and finite data, where overparameterization often makes parameter estimation infeasible. To mitigate this, the authors propose the centralized Gaussian linear structural causal model (CGL-SCM), a simplified subclass in which exogenous variables follow standardized distributions and which remains equally expressive with respect to the identifiability of causal effects from observational distributions. They further develop a novel expectation-maximization (EM) algorithm that learns CGL-SCM parameters and estimates identifiable causal effects from finite observational samples. Experiments on synthetic data and benchmark causal graphs support the theoretical analysis, showing that the learned models accurately recover the causal distributions.
📝 Abstract
Estimating causal effects from observational data remains a fundamental challenge in causal inference, especially in the presence of latent confounders. This paper focuses on estimating causal effects in Gaussian Linear Structural Causal Models (GL-SCMs), which are widely used due to their analytical tractability. However, parameter estimation in GL-SCMs is often infeasible with finite data, primarily due to overparameterization. To address this, we introduce the class of Centralized Gaussian Linear SCMs (CGL-SCMs), a simplified yet expressive subclass where exogenous variables follow standardized distributions. We show that CGL-SCMs are equally expressive in terms of causal effect identifiability from observational distributions and present a novel EM-based estimation algorithm that can learn CGL-SCM parameters and estimate identifiable causal effects from finite observational samples. Our theoretical analysis is validated through experiments on synthetic data and benchmark causal graphs, demonstrating that the learned models accurately recover causal distributions.
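The overparameterization mentioned in the abstract can be seen in a toy example. The sketch below is not the paper's algorithm or its exact model definition. It assumes a hypothetical two-variable linear model with one latent confounder U, and takes "standardized" to mean zero mean and unit variance. All names (`observational_cov`, `interventional`, `standardize`) and the numeric values are illustrative. It shows that the variances of the exogenous variables can be absorbed into the linear coefficients without changing either the observational covariance or the distribution of Y under do(X = x).

```python
import math

def observational_cov(a, b, c, k_x, k_y, var_u=1.0, var_ex=1.0, var_ey=1.0):
    """Return (Var X, Cov(X, Y), Var Y) for the toy model
         X = a*U + k_x*E_x,   Y = b*X + c*U + k_y*E_y,
    where U, E_x, E_y are independent zero-mean Gaussians (U is latent)."""
    var_x = a * a * var_u + k_x * k_x * var_ex
    cov_xy = b * var_x + a * c * var_u
    var_y = (b * b * var_x + 2 * b * a * c * var_u
             + c * c * var_u + k_y * k_y * var_ey)
    return var_x, cov_xy, var_y

def interventional(x, b, c, k_y, var_u=1.0, var_ey=1.0):
    """Mean and variance of Y under do(X = x): Y = b*x + c*U + k_y*E_y."""
    return b * x, c * c * var_u + k_y * k_y * var_ey

def standardize(a, c, k_x, k_y, var_u, var_ex, var_ey):
    """Absorb each exogenous standard deviation into its coefficient,
    so that every exogenous variable becomes N(0, 1)."""
    s_u, s_x, s_y = math.sqrt(var_u), math.sqrt(var_ex), math.sqrt(var_ey)
    return a * s_u, c * s_u, k_x * s_x, k_y * s_y

# A model with free exogenous variances (illustrative values).
a, b, c = 0.5, 1.5, -1.0
var_u, var_ex, var_ey = 4.0, 9.0, 0.25

cov_general = observational_cov(a, b, c, 1.0, 1.0, var_u, var_ex, var_ey)
do_general = interventional(2.0, b, c, 1.0, var_u, var_ey)

# The same model rewritten with standardized exogenous variables.
a_s, c_s, kx_s, ky_s = standardize(a, c, 1.0, 1.0, var_u, var_ex, var_ey)
cov_standard = observational_cov(a_s, b, c_s, kx_s, ky_s)
do_standard = interventional(2.0, b, c_s, ky_s)

print(cov_general, cov_standard)
print(do_general, do_standard)
```

Both parameterizations print the same covariance and the same interventional mean and variance, so the extra variance parameters add nothing that data could distinguish. The example only illustrates this rescaling redundancy. In this particular graph the coefficient `b` is not identifiable from observational data alone, and the sketch makes no claim about how the paper's EM procedure works.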