CauScientist: Teaching LLMs to Respect Data for Causal Discovery

📅 2026-01-20
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing causal discovery methods are often hindered by statistical indistinguishability, strong modeling assumptions, or unverified priors, limiting their ability to integrate data-driven inference with domain knowledge effectively. This work proposes CauScientist, a novel framework that, for the first time, leverages large language models (e.g., Qwen3-32B) as collaborative hypothesis generators paired with probabilistic validators. The approach employs hybrid graph initialization, iterative structural refinement, and an error-memory mechanism to guide the search process. CauScientist substantially advances the state of the art in discovering complex causal structures, achieving up to a 53.8% improvement in F1 score, increasing recall from 35.0% to 100.0%, and reducing structural Hamming distance by 44.0% on 37-node graphs, demonstrating both high accuracy and robustness.
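The three mechanisms named in the summary can be illustrated with a toy refinement loop. This is only a sketch under assumptions: every name below is hypothetical, graphs are edge sets, a fixed scoring function stands in for the statistical validator, and a random edge proposer stands in for the LLM hypothesis generator.

```python
import random

def apply_edit(graph, edit):
    """Apply an ('add'|'remove', (u, v)) edit to an edge-set graph."""
    op, edge = edit
    return graph | {edge} if op == "add" else graph - {edge}

def refine(candidates, score, nodes, steps=200, seed=0):
    """Toy sketch of a CauScientist-style search (hypothetical API):
    hybrid initialization, proposed edits, statistical acceptance,
    and an error memory of rejected modifications."""
    rng = random.Random(seed)
    graph = max(candidates, key=score)        # hybrid initialization
    error_memory = set()                      # rejected modifications
    for _ in range(steps):
        u, v = rng.sample(nodes, 2)           # proposer stand-in
        edit = ("remove" if (u, v) in graph else "add", (u, v))
        if edit in error_memory:
            continue                          # skip known-bad edits
        candidate = apply_edit(graph, edit)
        if score(candidate) > score(graph):   # verifier accepts
            graph = candidate
        else:
            error_memory.add(edit)            # remember the failure
    return graph

# Example: recover a target edge set by scoring overlap with it.
target = {("A", "B"), ("B", "C")}
score = lambda g: len(g & target) - len(g - target)
best = refine([frozenset(), frozenset({("A", "C")})], score,
              ["A", "B", "C"])
```

Because only score-improving edits are accepted, the loop converges to the target edge set; the real framework replaces the random proposer and overlap score with LLM proposals and probabilistic verification.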

📝 Abstract
Causal discovery is fundamental to scientific understanding and reliable decision-making. Existing approaches face critical limitations: purely data-driven methods suffer from statistical indistinguishability and modeling assumptions, while recent LLM-based methods either ignore statistical evidence or incorporate unverified priors that can mislead results. To this end, we propose CauScientist, a collaborative framework that synergizes LLMs as hypothesis-generating "data scientists" with probabilistic statistics as rigorous "verifiers". CauScientist employs hybrid initialization to select superior starting graphs, iteratively refines structures through LLM-proposed modifications validated by statistical criteria, and maintains an error memory to guide efficient search-space exploration. Experiments demonstrate that CauScientist substantially outperforms purely data-driven baselines, achieving up to a 53.8% F1-score improvement and enhancing recall from 35.0% to 100.0%. Notably, while standalone LLM performance degrades with graph complexity, CauScientist reduces structural Hamming distance (SHD) by 44.0% compared to Qwen3-32B on 37-node graphs. Our project page is at https://github.com/OpenCausaLab/CauScientist.
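The SHD figure in the abstract refers to the standard structural Hamming distance between directed graphs: the number of node pairs whose edge status (absent, present, or reversed) differs between the predicted and true graphs. A minimal computation over plain adjacency matrices, not taken from the paper:

```python
def shd(true_adj, pred_adj):
    """Structural Hamming distance between two directed graphs given
    as adjacency matrices: missing, extra, and reversed edges, with a
    reversed edge counted as a single error."""
    n = len(true_adj)
    errors = 0
    for i in range(n):
        for j in range(i + 1, n):  # examine each node pair once
            t = (true_adj[i][j], true_adj[j][i])
            p = (pred_adj[i][j], pred_adj[j][i])
            if t != p:
                errors += 1        # missing, extra, or reversed edge
    return errors

# True chain X -> Y -> Z versus a prediction with Y <- Z reversed:
true_g = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
pred_g = [[0, 1, 0], [0, 0, 0], [0, 1, 0]]
```

Here `shd(true_g, pred_g)` is 1, since only the Y–Z pair differs; a 44.0% reduction in this count on 37-node graphs is the abstract's headline robustness result.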
Problem

Research questions and friction points this paper is trying to address.

causal discovery
large language models
statistical indistinguishability
unverified priors
data-driven methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

causal discovery
large language models
probabilistic verification
hybrid initialization
iterative refinement