AI Summary
This work addresses causal graph structure learning under hard interventions from multiple sources in the presence of latent variables, aiming to characterize intervention equivalence classes, i.e., sets of distinct causal graphs that induce identical families of do-distributions. We propose the first systematic graphical characterization framework for such equivalence classes, grounded in do-calculus and d-separation theory, and establish sound edge orientation rules and graphical constraints. Our approach yields the first decidable criterion for intervention equivalence, unifying equivalence class characterization with structure learning. Furthermore, we design a provably sound hybrid algorithm that jointly leverages heterogeneous observational and interventional data for causal structure inference. Experiments demonstrate that our method significantly improves both accuracy and interpretability in identifying causal graphs under hard interventions.
Abstract
A fundamental challenge in the empirical sciences involves uncovering causal structure through observation and experimentation. Causal discovery entails linking the conditional independence (CI) invariances in observational data to their corresponding graphical constraints via d-separation. In this paper, we consider a general setting where we have access to data from multiple experimental distributions resulting from hard interventions, as well as potentially from an observational distribution. By comparing different interventional distributions, we propose a set of graphical constraints that are fundamentally linked to Pearl's do-calculus within the framework of hard interventions. These graphical constraints associate each graphical structure with a set of interventional distributions that are consistent with the rules of do-calculus. We characterize the interventional equivalence class of causal graphs with latent variables and introduce a graphical representation that can be used to determine whether two causal graphs are interventionally equivalent, i.e., whether they are associated with the same family of hard interventional distributions, where the elements of the family are indistinguishable using the invariances from do-calculus. We also propose a learning algorithm to integrate multiple datasets from hard interventions, introducing new orientation rules. The learning objective is a tuple of augmented graphs which entails a set of causal graphs. We also prove the soundness of the proposed algorithm.
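To make the link between d-separation and hard interventions concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a small hypothetical DAG in which the latent confounder `U` is written out explicitly, tests d-separation with the standard moralized ancestral graph criterion, and models a hard intervention by deleting all edges into the intervened variables. The graph and the helper names `ancestors`, `d_separated`, and `hard_intervention` are illustrative assumptions.

```python
from itertools import combinations


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen


def d_separated(parents, xs, ys, zs):
    """Check whether xs and ys are d-separated given zs in a DAG.

    Criterion: restrict to the ancestors of xs, ys, and zs, moralize
    (connect co-parents, drop directions), remove zs, and test whether
    any node of xs can still reach a node of ys.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestors(parents, xs | ys | zs)
    adj = {v: set() for v in keep}
    for child in keep:
        ps = [p for p in parents.get(child, ()) if p in keep]
        for p in ps:  # undirected parent-child edges
            adj[child].add(p)
            adj[p].add(child)
        for a, b in combinations(ps, 2):  # "marry" co-parents
            adj[a].add(b)
            adj[b].add(a)
    stack = [x for x in xs if x not in zs]
    seen = set(stack)
    while stack:
        v = stack.pop()
        if v in ys:
            return False
        for w in adj[v]:
            if w not in zs and w not in seen:
                seen.add(w)
                stack.append(w)
    return True


def hard_intervention(parents, targets):
    """Return the mutilated graph with all edges into targets removed."""
    targets = set(targets)
    return {v: ([] if v in targets else list(ps)) for v, ps in parents.items()}


# Hypothetical example: Z -> X -> Y, with latent U -> X and U -> Y.
graph = {"Z": [], "U": [], "X": ["Z", "U"], "Y": ["X", "U"]}

# Observationally, conditioning on the collider X opens Z -> X <- U -> Y.
obs_sep = d_separated(graph, {"Z"}, {"Y"}, {"X"})

# Under do(X), the edges into X are cut and Z is disconnected from Y.
do_sep = d_separated(hard_intervention(graph, {"X"}), {"Z"}, {"Y"}, {"X"})
```

In this toy graph `obs_sep` is false while `do_sep` is true, so the separation statement holds only in the mutilated graph. That is the kind of graphical condition Rule 1 of do-calculus uses to license dropping `Z` from P(y | do(x), z), and comparing such statements across interventional distributions is the general type of constraint the abstract refers to.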