From Pre-training Corpora to Large Language Models: What Factors Influence LLM Performance in Causal Discovery Tasks?

📅 2024-07-29
🏛️ arXiv.org
📈 Citations: 11 · Influential: 0
🤖 AI Summary
This study investigates the reliability foundations of large language models’ (LLMs) causal discovery capability, addressing three key problems: memorization effects, contamination of pretraining corpora by spurious causal associations, and prediction inconsistency induced by minor contextual perturbations. We construct a benchmark suite of causal discovery queries and conduct empirical evaluations on open-source LLMs. Our analysis quantitatively establishes, for the first time, a positive correlation between the frequency of causal mentions in pretraining data and model performance—while also revealing strong context sensitivity: predictions for the same causal pair vary significantly across different contexts. Methodologically, we bridge corpus-level statistical features with causal reasoning ability, demonstrating that high-frequency, semantically consistent causal expressions constitute a critical prerequisite for LLMs to implicitly acquire robust causal knowledge. This work provides a data-driven pathway for enhancing LLMs’ causal robustness through targeted corpus curation and modeling interventions.
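The core quantitative claim above — that causal-mention frequency in the pretraining corpus correlates positively with query accuracy — can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: `mention_count`, `spearman_rho`, and all data in the usage example are hypothetical stand-ins for the authors' corpus statistics and evaluation results.

```python
# Hypothetical sketch: correlate how often a causal pair is mentioned in a
# corpus with an LLM's accuracy on the corresponding causal discovery query.

def mention_count(corpus: list[str], cause: str, effect: str) -> int:
    """Count documents in which both terms of a causal pair co-occur."""
    return sum(1 for doc in corpus if cause in doc and effect in doc)

def spearman_rho(xs: list[float], ys: list[float]) -> float:
    """Spearman rank correlation (no tie handling, for illustration only)."""
    def ranks(vals: list[float]) -> list[float]:
        order = sorted(range(len(vals)), key=lambda i: vals[i])
        r = [0.0] * len(vals)
        for rank, i in enumerate(order):
            r[i] = float(rank)
        return r
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

# Illustrative usage with made-up numbers: per causal pair, a corpus
# mention count and the model's accuracy on queries about that pair.
counts = [3.0, 15.0, 120.0, 800.0]
accuracy = [0.42, 0.55, 0.71, 0.88]
rho = spearman_rho(counts, accuracy)  # a positive rho would mirror the paper's finding
```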

📝 Abstract
Recent advances in artificial intelligence have seen Large Language Models (LLMs) demonstrate notable proficiency in causal discovery tasks. This study explores the factors influencing the performance of LLMs in causal discovery tasks. Utilizing open-source LLMs, we examine how the frequency of causal relations within their pre-training corpora affects their ability to accurately respond to causal discovery queries. Our findings reveal that a higher frequency of causal mentions correlates with better model performance, suggesting that extensive exposure to causal information during training enhances the models' causal discovery capabilities. Additionally, we investigate the impact of context on the validity of causal relations. Our results indicate that LLMs might exhibit divergent predictions for identical causal relations when presented in different contexts. This paper provides the first comprehensive analysis of how different factors contribute to LLM performance in causal discovery tasks.
Problem

Research questions and friction points this paper is trying to address.

Evaluating LLMs' ability to discover causal relations from data
Assessing memorization impact and generalization limits in causal discovery
Examining how incorrect training data affects causal reasoning confidence
Innovation

Methods, ideas, or system contributions that make the work stand out.

Investigates LLMs' causal discovery using OLMo and BLOOM
Examines memorization impact and incorrect relation effects
Analyzes contextual nuances in causal relation understanding
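The context-sensitivity analysis in the last bullet can be sketched as a simple probe: embed the same causal pair in several prompt templates and measure how often the model's answers agree. Everything here is an assumed illustration — the templates, the `predict` callable, and the agreement metric are hypothetical, not the paper's actual prompts or models.

```python
# Hypothetical sketch of a context-sensitivity probe: the same causal pair
# is placed in different contexts, and we measure prediction agreement.
from collections import Counter

# Assumed prompt templates; the paper's actual contexts are not reproduced.
TEMPLATES = [
    "Does {cause} cause {effect}? Answer yes or no.",
    "In a medical report, a patient with {cause} later developed {effect}. Does {cause} cause {effect}?",
    "Some researchers dispute that {cause} causes {effect}. Does {cause} cause {effect}?",
]

def consistency(predict, cause: str, effect: str) -> float:
    """Fraction of prompt variants that agree with the majority answer.

    `predict` is any callable mapping a prompt string to a yes/no label,
    e.g. a wrapper around an LLM API (stubbed here).
    """
    answers = [predict(t.format(cause=cause, effect=effect)) for t in TEMPLATES]
    majority_size = Counter(answers).most_common(1)[0][1]
    return majority_size / len(answers)
```

A consistency of 1.0 means the model's verdict on the pair is stable across contexts; values well below 1.0 would reflect the divergent predictions the paper reports.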
Authors
Tao Feng — Monash University
Lizhen Qu — Monash University
Niket Tandon — Principal Researcher, Microsoft Research Bangalore (Commonsense Reasoning, AI)
Zhuang Li — Monash University
Xiaoxi Kang — Monash University
Gholamreza Haffari — Monash University