Hypothesizing Missing Causal Variables with LLMs

📅 2024-09-04
🏛️ arXiv.org
📈 Citations: 5
Influential: 0

🤖 AI Summary
This study addresses causal inference in observational studies, where confounding and mediation bias impede valid causal conclusions and randomized experiments are infeasible. We formally introduce the “Missing Causal Variable Identification” task: given an incomplete causal graph, hypothesize the missing variables, whether they act as causes, effects, or mediators. Methodologically, we propose a structured causal-graph prompting template for large language models (LLMs) that injects domain knowledge, and we construct a benchmark for causal role reasoning with controllable difficulty. Experiments show that LLMs are markedly better at hypothesizing mediators than at hypothesizing the cause and effect variables themselves, and that some open-source models outperform GPT-4. Contributions include: (1) a formal definition of the new task; (2) empirical characterization of LLMs’ capabilities and limitations in causal role reasoning; and (3) open-sourcing of the benchmark and evaluation protocol to advance trustworthy causal AI.

📝 Abstract
Scientific discovery is a catalyst for human intellectual advances, driven by the cycle of hypothesis generation, experimental design, data evaluation, and iterative assumption refinement. This process, while crucial, is expensive and heavily dependent on the domain knowledge of scientists to generate hypotheses and navigate the scientific cycle. Central to this is causality, the ability to establish the relationship between the cause and the effect. Motivated by the scientific discovery process, in this work, we formulate a novel task where the input is a partial causal graph with missing variables, and the output is a hypothesis about the missing variables to complete the partial graph. We design a benchmark with varying difficulty levels and knowledge assumptions about the causal graph. With the growing interest in using Large Language Models (LLMs) to assist in scientific discovery, we benchmark open-source and closed models on our testbed. We show the strong ability of LLMs to hypothesize the mediation variables between a cause and its effect. In contrast, they underperform in hypothesizing the cause and effect variables themselves. We also observe surprising results where some of the open-source models outperform the closed GPT-4 model.
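
The task described in the abstract (a partial causal graph in, a hypothesis about the missing variable out) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's actual benchmark code: the `PartialCausalGraph` class, the `X` placeholder, the prompt wording, and the smoking example are all hypothetical.

```python
from dataclasses import dataclass

MASK = "X"  # hypothetical placeholder name for the missing variable


@dataclass(frozen=True)
class PartialCausalGraph:
    """A directed graph in which one node has been replaced by a placeholder."""
    edges: tuple  # (cause, effect) pairs

    def parents(self, node):
        return sorted(c for c, e in self.edges if e == node)

    def children(self, node):
        return sorted(e for c, e in self.edges if c == node)


def mask_variable(edges, hidden, placeholder=MASK):
    """Replace every occurrence of `hidden` with the placeholder."""
    def swap(v):
        return placeholder if v == hidden else v
    return PartialCausalGraph(tuple((swap(c), swap(e)) for c, e in edges))


def build_prompt(graph, placeholder=MASK):
    """Render the partial graph as text and ask for a hypothesis.

    The wording is illustrative only; it is not the prompt used in the paper.
    """
    lines = [f"{c} -> {e}" for c, e in graph.edges]
    return (
        f"The following causal graph has one unnamed variable, shown as {placeholder}:\n"
        + "\n".join(lines)
        + f"\nSuggest a plausible variable for {placeholder}."
    )


# Toy example (assumed, not from the benchmark): hide the mediator.
full_edges = [("smoking", "tar deposits"), ("tar deposits", "lung cancer")]
partial = mask_variable(full_edges, hidden="tar deposits")
prompt = build_prompt(partial)
print(prompt)
```

Here the hidden node has both a parent and a child in the partial graph, which corresponds to the mediator case where the abstract reports the strongest results. Hiding `smoking` or `lung cancer` instead would give the cause and effect cases.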
Problem

Research questions and friction points this paper is trying to address.

Identifying backdoor variables in causal graphs
Completing partial causal graphs using contextual reasoning
Testing LLMs' hypothesis generation for causal inference
Innovation

Methods, ideas, or system contributions that make the work stand out.

Context-aware reasoning on parametric knowledge
Partial causal graph completion as a benchmark
Identifying backdoor variables through graph context
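
The page does not define "backdoor variables", so the sketch below assumes the simplest standard reading: a variable that causes the treatment and also reaches the outcome by a directed path that does not pass through the treatment (a common cause, or confounder). The function names and the toy graph are hypothetical and are not taken from the paper.

```python
def ancestors(edges, node):
    """Return all nodes with a directed path into `node`."""
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)
    seen, stack = set(), [node]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def common_causes(edges, treatment, outcome):
    """Variables that cause the treatment and reach the outcome without going through it."""
    # Drop every edge touching the treatment so that paths to the outcome
    # running through the treatment itself are not counted.
    without_treatment = [(c, e) for c, e in edges if treatment not in (c, e)]
    return sorted(ancestors(edges, treatment) & ancestors(without_treatment, outcome))


# Toy graph (assumed): "age" confounds exercise and heart disease,
# while "ad campaign" affects heart disease only through exercise.
edges = [
    ("age", "exercise"),
    ("age", "heart disease"),
    ("exercise", "heart disease"),
    ("ad campaign", "exercise"),
]
confounders = common_causes(edges, "exercise", "heart disease")
```

In this toy graph only `age` qualifies. `ad campaign` is an ancestor of both variables, but its only route to the outcome runs through the treatment, so it opens no backdoor path.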