A Survey on Enhancing Causal Reasoning Ability of Large Language Models

📅 2025-03-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Large language models (LLMs) exhibit limited performance on tasks demanding rigorous causal reasoning—such as those in healthcare and economics—due to their weak intrinsic causal modeling capabilities. To address the absence of a systematic survey in this domain, this work establishes the first structured research framework for enhancing causal reasoning in LLMs. We propose a unified four-dimensional taxonomy—spanning prompt-based, fine-tuning-based, architecture-based, and external-knowledge-integration approaches—and define standardized criteria for cross-method comparison. We comprehensively review existing evaluation benchmarks and metrics, exposing critical assessment biases and limitations. Furthermore, we synthesize key challenges and identify promising future directions, including interpretability enhancement, counterfactual reasoning modeling, and causal–linguistic joint pretraining. Covering the full technical spectrum—from prompting to architectural redesign—the survey serves as an authoritative reference and practical guide for integrating causal reasoning with LLMs.
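To make the prompt-based category of the taxonomy concrete, the sketch below shows one common pattern in this line of work: wrapping a causal question in an explicit cause-effect reasoning scaffold before passing it to an LLM. This is a minimal illustration only; the prompt wording and the `call_llm` stub are assumptions for demonstration and are not taken from the surveyed paper.

```python
# Minimal sketch of a prompt-based causal reasoning enhancement.
# The template text and the call_llm stub are illustrative assumptions,
# not the method of any specific surveyed paper.

CAUSAL_PROMPT_TEMPLATE = """You are asked a causal question.
Question: {question}

Reason step by step:
1. List the candidate cause(s) and effect(s) mentioned in the question.
2. For each candidate pair, state whether the relation is causal,
   merely correlational, or confounded by a third variable.
3. Give your final answer on the last line as "Answer: ...".
"""


def build_causal_prompt(question: str) -> str:
    """Wrap a raw question in an explicit causal-reasoning scaffold."""
    return CAUSAL_PROMPT_TEMPLATE.format(question=question)


def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (API client or local model)."""
    raise NotImplementedError("Plug in your preferred LLM client here.")


if __name__ == "__main__":
    q = "Does regular exercise cause lower blood pressure, or are both driven by diet?"
    print(build_causal_prompt(q))
```

The other three taxonomy dimensions (fine-tuning, architectural changes, and external-knowledge integration) modify the model or its inputs more deeply and are not captured by a prompt template alone.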

📝 Abstract
Large language models (LLMs) have recently shown remarkable performance in language tasks and beyond. However, owing to their limited inherent causal reasoning ability, LLMs still struggle with tasks that demand robust causal reasoning, such as healthcare and economic analysis. As a result, a growing body of research has focused on enhancing the causal reasoning ability of LLMs. Despite this booming research, there is no survey that thoroughly reviews the challenges, progress, and future directions in this area. To bridge this significant gap, we systematically review the literature on how to strengthen LLMs' causal reasoning ability in this paper. We begin with the background and motivations of this topic, followed by a summary of the key challenges in this area. Thereafter, we propose a novel taxonomy to systematically categorise existing methods, together with detailed comparisons within and between classes of methods. Furthermore, we summarise existing benchmarks and evaluation metrics for assessing LLMs' causal reasoning ability. Finally, we outline future research directions for this emerging field, offering insights and inspiration to researchers and practitioners in the area.
Problem

Research questions and friction points this paper is trying to address.

Enhancing causal reasoning in large language models
Addressing challenges in healthcare and economic analysis
Reviewing methods, benchmarks, and future research directions
Innovation

Methods, ideas, or system contributions that make the work stand out.

Systematic review of LLMs' causal reasoning enhancement
Novel taxonomy for categorizing existing methods
Summary of benchmarks for evaluating causal reasoning
Xin Li
University of Technology Sydney, Ultimo NSW 2007, Australia
Zhuo Cai
University of Technology Sydney, Ultimo NSW 2007, Australia
Shoujin Wang
University of Technology Sydney
Research interests: Data Science, Machine Learning, Recommender System, Misinformation, Data Science Application
Kun Yu
University of Technology Sydney, Ultimo NSW 2007, Australia
Fang Chen
University of Technology Sydney, Ultimo NSW 2007, Australia