Score matching through the roof: linear, nonlinear, and latent variables causal discovery

📅 2024-07-26
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses causal discovery in two challenging settings: hidden (latent) variables, and causal mechanisms that are not guaranteed to be nonlinear. We propose an approach grounded in the score function ∇ log p(X) of the observed variables. Theoretically, we refine existing score-based identifiability results for additive noise models (ANMs), showing that their assumption of nonlinear causal mechanisms is not necessary. We further establish conditions under which the score can be used to infer the equivalence class of causal graphs with hidden variables, and we give sufficient conditions for identifying direct causes in latent-variable models. Methodologically, we build on these results to propose a flexible causal discovery algorithm for linear, nonlinear, and latent-variable models, which we validate empirically.

📝 Abstract
Causal discovery from observational data holds great promise, but existing methods rely on strong assumptions about the underlying causal structure, often requiring full observability of all relevant variables. We tackle these challenges by leveraging the score function $\nabla \log p(X)$ of observed variables for causal discovery and propose the following contributions. First, we fine-tune the existing identifiability results with the score on additive noise models, showing that their assumption of nonlinearity of the causal mechanisms is not necessary. Second, we establish conditions for inferring causal relations from the score even in the presence of hidden variables; this result is two-faced: we demonstrate the score's potential to infer the equivalence class of causal graphs with hidden variables (while previous results are restricted to the fully observable setting), and we provide sufficient conditions for identifying direct causes in latent variable models. Building on these insights, we propose a flexible algorithm suited for causal discovery on linear, nonlinear, and latent variable models, which we empirically validate.
Problem

Research questions and friction points this paper is trying to address.

Relaxing strong assumptions in causal discovery methods
Enabling causal inference with hidden variables
Proposing a flexible algorithm for diverse causal models
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses score function for causal discovery
Relaxes nonlinearity assumption in models
Handles hidden variables in causal graphs
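
To make the score function ∇ log p(X) concrete, here is a minimal sketch for a hypothetical two-variable linear additive noise model with Gaussian noise. The model, the constants A, S1, S2, and all function names are assumptions chosen for illustration; this is not the paper's algorithm. It only shows that the score component of the sink node X2 equals the negated noise residual divided by the noise variance, one way in which the score carries information about causal structure.

```python
import math

# Hypothetical linear additive noise model (illustration only):
#   X1 = N1,  X2 = A * X1 + N2,  N1 ~ N(0, S1^2),  N2 ~ N(0, S2^2)
A, S1, S2 = 1.5, 1.0, 0.5

def log_gauss(z, sigma):
    """Log density of a zero-mean Gaussian with standard deviation sigma."""
    return -0.5 * (z / sigma) ** 2 - math.log(sigma * math.sqrt(2 * math.pi))

def log_p(x1, x2):
    """Joint log density, factorized along the causal order X1 -> X2."""
    return log_gauss(x1, S1) + log_gauss(x2 - A * x1, S2)

def score(x1, x2):
    """Analytic score: gradient of log_p with respect to (x1, x2)."""
    residual = x2 - A * x1  # value of the noise N2 at the sink node X2
    d_x1 = -x1 / S1**2 + A * residual / S2**2
    d_x2 = -residual / S2**2  # depends on x only through the residual
    return d_x1, d_x2

def numeric_score(x1, x2, eps=1e-5):
    """Central finite-difference approximation of the score."""
    d_x1 = (log_p(x1 + eps, x2) - log_p(x1 - eps, x2)) / (2 * eps)
    d_x2 = (log_p(x1, x2 + eps) - log_p(x1, x2 - eps)) / (2 * eps)
    return d_x1, d_x2

analytic = score(0.3, -0.7)
numeric = numeric_score(0.3, -0.7)
print("analytic score:", analytic)
print("numeric score: ", numeric)
```

In practice the density is unknown, so the score has to be estimated from samples (for example by score matching, as the title suggests); the closed form here serves only as a sanity check on the definition.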