Efficient Latent Variable Causal Discovery: Combining Score Search and Targeted Testing

📅 2025-10-05

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Learning Partial Ancestral Graphs (PAGs) from observational data with latent variables or selection bias remains computationally expensive and statistically fragile. Method: The paper proposes FCIT (FCI Targeted-testing), a PAG learning method in the FCI tradition that replaces exhaustive all-subsets conditional independence (CI) testing with targeted tests guided by BOSS. It also introduces BOSS-FCI and GRaSP-FCI, variants of GFCI that substitute BOSS or GRaSP for FGES, and LV-Dumb (also known as BOSS-POD), a heuristic that skips latent-variable-specific reasoning and directly returns the PAG of the BOSS DAG. Contribution/Results: In simulations and real-data analyses, BOSS-FCI and GRaSP-FCI serve as sound baselines, and FCIT improves on them in both efficiency and reliability, yielding well-formed PAGs with higher precision. LV-Dumb is not strictly correct in the FCI sense, but it scales better and often achieves superior accuracy in practice.

📝 Abstract

Learning causal structure from observational data is especially challenging when latent variables or selection bias are present. The Fast Causal Inference (FCI) algorithm addresses this setting but often performs exhaustive conditional independence tests across many subsets, leading to spurious independence claims, extra or missing edges, and unreliable orientations. We present a family of score-guided mixed-strategy causal search algorithms that build on this tradition. First, we introduce BOSS-FCI and GRaSP-FCI, straightforward variants of GFCI that substitute BOSS or GRaSP for FGES, thereby retaining correctness while incurring different scalability tradeoffs. Second, we develop FCI Targeted-testing (FCIT), a novel mixed-strategy method that improves upon these variants by replacing exhaustive all-subsets testing with targeted tests guided by BOSS, yielding well-formed PAGs with higher precision and efficiency. Finally, we propose a simple heuristic, LV-Dumb (also known as BOSS-POD), which bypasses latent-variable-specific reasoning and directly returns the PAG of the BOSS DAG. Although not strictly correct in the FCI sense, it scales better and often achieves superior accuracy in practice. Simulations and real-data analyses demonstrate that BOSS-FCI and GRaSP-FCI provide sound baselines, FCIT improves both efficiency and reliability, and LV-Dumb offers a practical heuristic with strong empirical performance. Together, these methods highlight the value of score-guided and targeted strategies for scalable latent-variable causal discovery.
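
To make the cost of exhaustive all-subsets testing concrete, the following minimal sketch counts the candidate conditioning sets for a single pair of variables. It is an illustration only, not the FCI or FCIT test schedule: the function names `conditioning_sets` and `count_conditioning_sets` are hypothetical, and real implementations typically cap the subset size and prune as edges are removed.

```python
from itertools import combinations
from math import comb


def conditioning_sets(adjacents, max_size):
    """Yield every subset of `adjacents` with at most `max_size` elements."""
    for size in range(max_size + 1):
        yield from combinations(adjacents, size)


def count_conditioning_sets(num_adjacents, max_size):
    """Closed-form count of the subsets enumerated by `conditioning_sets`."""
    largest = min(max_size, num_adjacents)
    return sum(comb(num_adjacents, k) for k in range(largest + 1))


if __name__ == "__main__":
    # Worst case (no size cap): the count doubles with every extra adjacent node.
    for degree in (5, 10, 20):
        print(degree, count_conditioning_sets(degree, degree))
```

Each candidate set corresponds to one possible CI test for the pair, so a variable pair with 20 potential conditioning variables admits over a million tests in the uncapped worst case. Every additional test is another opportunity for a false independence judgment, which is the motivation the abstract gives for targeted testing.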

Problem

Research questions and friction points this paper is trying to address.

Addressing latent variable and selection bias challenges in causal discovery

Improving FCI algorithm efficiency by reducing exhaustive conditional independence tests

Developing scalable methods for reliable causal structure learning
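
For readers unfamiliar with what a single conditional independence test looks like, here is a minimal sketch of a Fisher Z test with one conditioning variable, using only the Python standard library. The source does not say which CI test the paper uses; Fisher Z is assumed here because it is a common choice for linear Gaussian data. All function names and the simulated chain X -> Z -> Y are illustrative assumptions.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Partial correlation of x and y given a single variable z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def fisher_z_pvalue(r, n, cond_size):
    """Two-sided p-value for H0: (partial) correlation is zero."""
    stat = 0.5 * math.log((1 + r) / (1 - r)) * math.sqrt(n - cond_size - 3)
    return math.erfc(abs(stat) / math.sqrt(2))


def simulate_chain(n, seed=0):
    """Draw n samples from the linear Gaussian chain X -> Z -> Y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [0.8 * a + rng.gauss(0, 1) for a in x]
    y = [0.8 * b + rng.gauss(0, 1) for b in z]
    return x, y, z


if __name__ == "__main__":
    n = 5000
    x, y, z = simulate_chain(n)
    print("p-value, X vs Y:", fisher_z_pvalue(pearson(x, y), n, 0))
    print("p-value, X vs Y given Z:", fisher_z_pvalue(partial_corr(x, y, z), n, 1))
```

In this chain, X and Y are dependent marginally but independent given Z, so the first p-value should be tiny and the second should typically be unremarkable. A constraint-based search removes the X-Y edge when some conditioning set produces a non-significant result, which is why running many such tests on finite samples can produce spurious independence claims.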

Innovation

Methods, ideas, or system contributions that make the work stand out.

BOSS-FCI and GRaSP-FCI substitute BOSS or GRaSP for FGES within GFCI

FCIT replaces exhaustive all-subsets CI testing with targeted tests guided by BOSS

LV-Dumb (BOSS-POD) heuristic bypasses latent-variable-specific reasoning and directly returns the PAG of the BOSS DAG

🔎 Similar Papers

Score matching through the roof: linear, nonlinear, and latent variables causal discovery