🤖 AI Summary
Real-world relational data often contain unobserved confounders and violate both the i.i.d. assumption and causal sufficiency, so standard causal discovery methods do not apply. To address this, we propose RelFCI, the first sound and complete causal discovery algorithm designed specifically for relational data with latent variables. Our contributions are threefold: (1) we introduce new relational causal graphical models and establish soundness and completeness guarantees for relational d-separation with latent confounders; (2) we combine FCI’s ability to handle latent confounders with RCD’s ability to model relational structure, so that dependencies between non-i.i.d. instances are represented explicitly; (3) experiments show that RelFCI outperforms baseline methods across a range of relational causal models with latent confounders. RelFCI thus provides both a theoretical foundation and a practical tool for causal discovery in relational domains.
📝 Abstract
Estimating causal effects from real-world relational data can be challenging when the underlying causal model and potential confounders are unknown. While several causal discovery algorithms exist for learning causal models with latent confounders from data, they assume that the data is independent and identically distributed (i.i.d.) and are not well-suited for learning from relational data. Similarly, existing relational causal discovery algorithms assume causal sufficiency, which is unrealistic for many real-world datasets. To address this gap, we propose RelFCI, a sound and complete causal discovery algorithm for relational data with latent confounders. Our work builds upon the Fast Causal Inference (FCI) and Relational Causal Discovery (RCD) algorithms and defines new graphical models needed to support causal discovery in relational domains. We also establish soundness and completeness guarantees for relational d-separation with latent confounders. We present experimental results demonstrating the effectiveness of RelFCI in identifying the correct causal structure in relational causal models with latent confounders.