Estimating Causal Effects in Gaussian Linear SCMs with Finite Data

📅 2026-01-08

🏛️ arXiv.org

📈 Citations: 1

✨ Influential: 0

🤖 AI Summary

This work addresses the estimation of causal effects in Gaussian linear structural causal models (GL-SCMs) under latent confounding and limited data, where overparameterization makes parameter estimation infeasible. The authors propose the Centralized Gaussian Linear SCM (CGL-SCM), a subclass in which exogenous variables follow standardized distributions, and show that it is equally expressive with respect to causal effect identifiability from observational distributions. They then develop an expectation-maximization (EM) based algorithm that learns CGL-SCM parameters and estimates identifiable causal effects from finite observational samples. Experiments on synthetic data and benchmark causal graphs support the theoretical analysis, showing that the learned models accurately recover the causal distributions.
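To make the overparameterization point concrete, here is a minimal sketch on a toy model that is an assumption for illustration, not the paper's construction: one latent confounder U, a treatment X and an outcome Y. The function `observed_cov` and the coefficient names `a`, `b`, `c` are hypothetical. The sketch shows that the variance of the latent variable can be absorbed into its outgoing coefficients without changing the observed covariance, which is the intuition behind standardizing exogenous variables.

```python
import numpy as np

def observed_cov(a, b, c, var_u, var_ex=1.0, var_ey=1.0):
    """Covariance of the observed pair (X, Y) implied by the toy linear SCM
         U ~ N(0, var_u)                       (latent confounder)
         X = a*U + e_x,        e_x ~ N(0, var_ex)
         Y = b*X + c*U + e_y,  e_y ~ N(0, var_ey)
    """
    var_x = a**2 * var_u + var_ex
    cov_xy = b * var_x + a * c * var_u
    var_y = b**2 * var_x + 2 * a * b * c * var_u + c**2 * var_u + var_ey
    return np.array([[var_x, cov_xy], [cov_xy, var_y]])

# Original parameterization: the latent variable has variance 4.
var_u = 4.0
scale = np.sqrt(var_u)
original = observed_cov(a=0.8, b=0.5, c=1.2, var_u=var_u)

# Standardized parameterization: U has unit variance and its
# outgoing coefficients a and c are rescaled by sqrt(var_u).
standardized = observed_cov(a=0.8 * scale, b=0.5, c=1.2 * scale, var_u=1.0)

print(np.allclose(original, standardized))
```

Both parameterizations imply the same distribution over (X, Y), and the direct effect `b` is unchanged, so in this toy model the free variance parameter adds nothing that observational data could pin down.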

📝 Abstract

Estimating causal effects from observational data remains a fundamental challenge in causal inference, especially in the presence of latent confounders. This paper focuses on estimating causal effects in Gaussian Linear Structural Causal Models (GL-SCMs), which are widely used due to their analytical tractability. However, parameter estimation in GL-SCMs is often infeasible with finite data, primarily due to overparameterization. To address this, we introduce the class of Centralized Gaussian Linear SCMs (CGL-SCMs), a simplified yet expressive subclass where exogenous variables follow standardized distributions. We show that CGL-SCMs are equally expressive in terms of causal effect identifiability from observational distributions and present a novel EM-based estimation algorithm that can learn CGL-SCM parameters and estimate identifiable causal effects from finite observational samples. Our theoretical analysis is validated through experiments on synthetic data and benchmark causal graphs, demonstrating that the learned models accurately recover causal distributions.
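As a rough illustration of why latent confounders make the problem hard, the sketch below simulates a finite sample from a toy linear Gaussian SCM and compares the true direct effect with a naive regression slope. The model, the helper names `simulate` and `ols_slope`, and all coefficient values are assumptions chosen for illustration; this is not the EM procedure described in the abstract.

```python
import numpy as np

def simulate(n, a=1.0, b=0.5, c=1.0, seed=0):
    """Draw n samples of (X, Y) from a toy SCM with a hidden confounder U.
    All exogenous variables are standard normal (an illustrative choice)."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(n)                   # latent, never observed
    x = a * u + rng.standard_normal(n)           # treatment
    y = b * x + c * u + rng.standard_normal(n)   # outcome; true effect of X is b
    return x, y

def ols_slope(x, y):
    """Slope of the least-squares regression of y on x."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    return float(x_c @ y_c / (x_c @ x_c))

x, y = simulate(n=20_000)
naive = ols_slope(x, y)
print(f"true effect b = 0.5, naive OLS slope = {naive:.3f}")
```

With these settings the population regression slope is b + a*c/(a^2 + 1) = 1.0 rather than 0.5, so the naive estimate stays biased no matter how many samples are drawn. In this two-variable graph the effect is not identifiable from observational data at all; the abstract concerns graphs where it is, and where the remaining difficulty is learning the parameters from finite samples.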

Problem

Research questions and friction points this paper is trying to address.

causal effect estimation

Gaussian Linear SCM

latent confounders

finite observational data

overparameterization

Innovation

Methods, ideas, or system contributions that make the work stand out.

Centralized Gaussian Linear SCM

causal effect estimation

EM algorithm

finite observational data

causal identifiability

🔎 Similar Papers

Optimizing VarLiNGAM for Scalable and Efficient Time Series Causal Discovery

2024-09-09, arXiv.org, Citations: 0