AI Summary
Learning Partial Ancestral Graphs (PAGs) from observational data with latent variables or selection bias remains computationally expensive and statistically fragile.
Method: We propose FCIT, a robust and efficient PAG learning framework built on FCI that replaces exhaustive conditional independence (CI) queries with targeted CI tests guided by BOSS. We also propose the LV-Dumb heuristic (also known as BOSS-POD), which skips latent-variable-specific reasoning and directly returns the PAG of the BOSS DAG. The framework combines score-based search (e.g., BOSS, GRaSP) with targeted CI testing.
Contribution/Results: BOSS-FCI and GRaSP-FCI serve as sound baselines; FCIT achieves speedups of several orders of magnitude over state-of-the-art methods while attaining higher structural recovery accuracy. LV-Dumb, although not strictly correct in the FCI sense, scales better and often achieves superior accuracy in practice on simulated and real data.
Abstract
Learning causal structure from observational data is especially challenging when latent variables or selection bias are present. The Fast Causal Inference (FCI) algorithm addresses this setting but often performs exhaustive conditional independence tests across many subsets, leading to spurious independence claims, extra or missing edges, and unreliable orientations. We present a family of score-guided mixed-strategy causal search algorithms that build on this tradition. First, we introduce BOSS-FCI and GRaSP-FCI, straightforward variants of GFCI that substitute BOSS or GRaSP for FGES, thereby retaining correctness while incurring different scalability tradeoffs. Second, we develop FCI Targeted-testing (FCIT), a novel mixed-strategy method that improves upon these variants by replacing exhaustive all-subsets testing with targeted tests guided by BOSS, yielding well-formed PAGs with higher precision and efficiency. Finally, we propose a simple heuristic, LV-Dumb (also known as BOSS-POD), which bypasses latent-variable-specific reasoning and directly returns the PAG of the BOSS DAG. Although not strictly correct in the FCI sense, it scales better and often achieves superior accuracy in practice. Simulations and real-data analyses demonstrate that BOSS-FCI and GRaSP-FCI provide sound baselines, FCIT improves both efficiency and reliability, and LV-Dumb offers a practical heuristic with strong empirical performance. Together, these methods highlight the value of score-guided and targeted strategies for scalable latent-variable causal discovery.
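To give a rough sense of why exhaustive all-subsets testing is costly and fragile, the following minimal Python sketch counts the conditioning sets an exhaustive search would try for a single pair of variables, and estimates how quickly the chance of at least one spurious independence claim grows with the number of tests. It is an illustration only, not the paper's implementation: the function names, the depth cap, and especially the assumption that tests err independently with a fixed probability `miss_rate` are all hypothetical simplifications introduced here.

```python
from itertools import combinations
from math import comb


def count_conditioning_sets(num_neighbors: int, max_depth: int) -> int:
    """Count conditioning sets of size 0..max_depth drawn from num_neighbors variables."""
    depth = min(max_depth, num_neighbors)
    return sum(comb(num_neighbors, size) for size in range(depth + 1))


def enumerate_conditioning_sets(neighbors, max_depth):
    """Yield every conditioning set an exhaustive search would try, smallest first."""
    for size in range(min(max_depth, len(neighbors)) + 1):
        yield from combinations(neighbors, size)


def prob_spurious_removal(num_tests: int, miss_rate: float) -> float:
    """Chance that at least one of num_tests tests wrongly reports independence.

    Toy assumption (not from the paper): each test errs independently
    with probability miss_rate.
    """
    return 1.0 - (1.0 - miss_rate) ** num_tests


if __name__ == "__main__":
    tests = count_conditioning_sets(num_neighbors=10, max_depth=3)
    print("conditioning sets per pair:", tests)
    print("chance of a spurious removal:", prob_spurious_removal(tests, 0.05))
```

Under these toy assumptions, even a modest per-test error rate compounds rapidly as the number of candidate conditioning sets grows, which is the motivation for testing only a targeted handful of sets per edge.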