Causal Reasoning in Pieces: Modular In-Context Learning for Causal Discovery

📅 2025-07-31
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Large language models (LLMs) exhibit poor robustness in causal discovery tasks, degrading to near-random performance under data perturbations. To address this, we propose Modular Context Learning (MCL), a structured prompting framework inspired by chain-of-thought and tree-of-thought reasoning. MCL decomposes causal inference into four interpretable, sequential submodules: observable variable identification, correlation analysis, confounder discrimination, and causal graph generation. Evaluated on the Corr2Cause benchmark, MCL, when instantiated with OpenAI's o-series and DeepSeek-R models, achieves a nearly threefold accuracy improvement over conventional methods. We further show that reasoning chain length and structural complexity critically influence robustness. This work introduces the first modular architecture for LLM-based causal discovery, substantially enhancing both accuracy and perturbation resilience, and establishes a general, interpretable, and scalable framework for cross-domain causal inference.

📝 Abstract
Causal inference remains a fundamental challenge for large language models. Recent advances in internal reasoning with large language models have sparked interest in whether state-of-the-art reasoning models can robustly perform causal discovery, a task on which conventional models often suffer from severe overfitting and near-random performance under data perturbations. We study causal discovery on the Corr2Cause benchmark using the emerging OpenAI o-series and DeepSeek-R model families and find that these reasoning-first architectures achieve significantly greater native gains than prior approaches. To capitalize on these strengths, we introduce a modular in-context pipeline inspired by the Tree-of-Thoughts and Chain-of-Thought methodologies, yielding nearly threefold improvements over conventional baselines. We further probe the pipeline's impact by analyzing reasoning chain length and complexity and by conducting qualitative and quantitative comparisons between conventional and reasoning models. Our findings suggest that while advanced reasoning models represent a substantial leap forward, carefully structured in-context frameworks are essential to maximize their capabilities and offer a generalizable blueprint for causal discovery across diverse domains.
Problem

Research questions and friction points this paper is trying to address.

Challenges in causal inference for large language models
Overfitting and poor performance in conventional causal discovery models
Need for structured frameworks to enhance reasoning models' capabilities
Innovation

Methods, ideas, or system contributions that make the work stand out.

Modular in-context pipeline for causal discovery
Tree-of-Thoughts and Chain-of-Thought prompting methodologies
Advanced reasoning models with structured frameworks
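The summary describes the pipeline as four sequential submodules, each an in-context call whose output feeds the next stage. A minimal sketch of that control flow is shown below; the stage prompts and the `llm` callable are hypothetical stand-ins for illustration, not the paper's actual prompts or model interface.

```python
# Illustrative sketch of the four-stage modular in-context pipeline:
# variable identification -> correlation analysis -> confounder
# discrimination -> causal graph generation. Prompt wording is assumed.

STAGES = [
    ("variables",
     "Identify the observable variables in the following problem:\n{context}"),
    ("correlations",
     "Given variables {variables}, list the correlations stated in:\n{context}"),
    ("confounders",
     "Given correlations {correlations}, which could be explained by a latent confounder?"),
    ("graph",
     "Using variables {variables}, correlations {correlations}, and confounders "
     "{confounders}, output the causal graph as a list of directed edges."),
]

def modular_causal_discovery(context, llm):
    """Run each submodule in sequence, feeding earlier outputs forward.

    `llm` is any callable mapping a prompt string to a response string.
    """
    state = {"context": context}
    for name, template in STAGES:
        prompt = template.format(**state)
        state[name] = llm(prompt)  # each stage is a separate in-context call
    return state["graph"]
```

The key design point the paper emphasizes is decomposition: each stage sees only a focused sub-question plus the intermediate results, rather than one monolithic causal-discovery prompt.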
Kacper Kadziolka
Leiden Institute of Advanced Computer Science (LIACS)
Saber Salehkaleybar
Leiden University
Causal Inference · Stochastic Optimization · Reinforcement Learning