🤖 AI Summary
This work addresses causal discovery from multiple observational datasets whose variable sets are not identical, where variables unobserved in one dataset can act as confounders and some variable pairs are never observed together in any dataset. The authors propose I-CAM-UV, a causal graph integration method built on Causal Additive Models with Unobserved Variables (CAM-UV). The approach first learns a local causal structure from each dataset with CAM-UV, which explicitly accounts for latent confounders, and then integrates these partial structures by enumerating the global causal graphs that are structurally consistent with the CAM-UV output on every dataset. It is presented as the first method capable of effectively fusing causal discovery results over non-aligned variable sets. Experimental evaluations indicate that the proposed method outperforms existing approaches, particularly in scenarios with strong confounding and many unobserved variables, yielding more complete and accurate reconstructions of the underlying causal structure.
📝 Abstract
Causal discovery from observational data is a fundamental tool in various fields of science. While existing approaches are typically designed for a single dataset, in practice we often need to handle multiple datasets with non-identical variable sets. One straightforward approach is to estimate a causal graph from each dataset and construct a single causal graph by overlaying them. However, this approach identifies only a limited set of causal relationships, because variables unobserved in a given dataset can act as confounders and some variable pairs may never be observed together in any dataset. To address this issue, we leverage Causal Additive Models with Unobserved Variables (CAM-UV), which produce causal graphs that carry information about unobserved variables. We show that the ground-truth causal graph is structurally consistent with the information CAM-UV provides on each dataset. Based on this result, we propose an approach named I-CAM-UV that integrates CAM-UV results by enumerating all consistent causal graphs. We also provide an efficient combinatorial search algorithm and demonstrate the usefulness of I-CAM-UV against existing methods.
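To make the limitation of the straightforward baseline concrete, the following minimal Python sketch overlays per-dataset edge sets and lists the variable pairs that no dataset observes together. It illustrates only the naive overlap approach, not I-CAM-UV or CAM-UV; the function names, dataset labels, variable names, and toy edges are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def overlay_graphs(local_graphs):
    """Union the directed edges estimated separately on each dataset."""
    merged = set()
    for edges in local_graphs.values():
        merged |= set(edges)
    return merged


def never_co_observed(variable_sets):
    """Return variable pairs that do not appear together in any dataset."""
    all_vars = sorted(set().union(*variable_sets.values()))
    return [
        (a, b)
        for a, b in combinations(all_vars, 2)
        if not any(a in vs and b in vs for vs in variable_sets.values())
    ]


# Hypothetical toy input: two datasets with overlapping but non-identical variables.
variable_sets = {"D1": {"x1", "x2", "x3"}, "D2": {"x2", "x3", "x4"}}
# Hypothetical per-dataset estimates, each edge written as (cause, effect).
local_graphs = {"D1": {("x1", "x2")}, "D2": {("x3", "x4")}}

merged_edges = overlay_graphs(local_graphs)
blind_pairs = never_co_observed(variable_sets)
print(merged_edges)
print(blind_pairs)
```

In this toy setting the overlay can say nothing about the pair (x1, x4), since no dataset contains both variables. This is the kind of gap the abstract attributes to the naive approach.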