CauSim: Scaling Causal Reasoning with Increasingly Complex Causal Simulators

📅 2026-05-09

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Large language models remain limited in causal reasoning because causal systems are complex and executable ground-truth data is scarce. To address this, this work proposes CauSim, a framework that, for the first time, automatically constructs verifiable structural causal model (SCM) simulators from non-executable causal knowledge. The approach reframes causal reasoning as a scalable supervised learning problem through three mechanisms: curriculum-based complexity progression, bidirectional translation between natural language and executable models, and domain-informed data augmentation. Together these enable generalization across representations and model self-improvement. Experiments show that CauSim substantially improves performance across diverse causal reasoning tasks, with gains that grow consistently as simulator complexity and data volume increase.

📝 Abstract

Despite surpassing human performance across mathematics, coding, and other knowledge-intensive tasks, large language models (LLMs) continue to struggle with causal reasoning. A core obstacle is the target data itself: causal systems are complex and often expressed in non-executable forms, while ground-truth answers to causal queries are inherently scarce. We introduce CauSim, a framework that turns causal reasoning from a scarce-label problem into a scalable supervised one. CauSim constructs increasingly complex causal simulators: executable structural causal models (SCMs), incrementally built by LLMs, that scale to globally complex systems while maintaining verifiable answers to causal queries. CauSim operates across representations by formalizing non-executable causal knowledge into code, enabling data augmentation, and translating executable SCMs into natural language, enabling supervision in previously difficult-to-supervise representations. We structure our research into two parts: (1) how to construct increasingly complex causal simulators, and (2) a systematic study of what CauSim enables, demonstrating generalization across representations, consistent gains from curriculum scaling and data volume, LLM self-improvement through self-generated simulators, and data augmentation via formalization of existing domain knowledge.
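To make the idea of an executable SCM with verifiable answers concrete, here is a minimal sketch in Python. It is not CauSim's implementation: the three-variable graph (Z confounds X and Y), the coefficients, and the function names `sample_noise`, `run_scm`, `average_treatment_effect`, and `naive_difference` are all hypothetical choices made for illustration. The point is only that once structural equations are written as code, an interventional query can be answered by running the simulator, so the ground-truth label comes for free.

```python
import random

def sample_noise(rng):
    """Draw the exogenous noise terms for one unit (hypothetical toy SCM)."""
    return {"u_z": rng.gauss(0, 1), "u_x": rng.gauss(0, 1), "u_y": rng.gauss(0, 1)}

def run_scm(noise, do=None):
    """Evaluate the structural equations in topological order.

    `do` maps a variable name to a forced value, which replaces that
    variable's structural equation (an intervention).
    """
    do = do or {}
    v = {}
    v["Z"] = do.get("Z", noise["u_z"])                               # confounder
    v["X"] = do.get("X", 1.0 if v["Z"] + noise["u_x"] > 0 else 0.0)  # treatment
    v["Y"] = do.get("Y", 2.0 * v["X"] + 3.0 * v["Z"] + noise["u_y"]) # outcome
    return v

def average_treatment_effect(n=10_000, seed=0):
    """Estimate E[Y | do(X=1)] - E[Y | do(X=0)] by simulation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        noise = sample_noise(rng)  # same noise under both interventions
        y1 = run_scm(noise, do={"X": 1.0})["Y"]
        y0 = run_scm(noise, do={"X": 0.0})["Y"]
        total += y1 - y0
    return total / n

def naive_difference(n=10_000, seed=0):
    """Observational contrast E[Y | X=1] - E[Y | X=0], biased by Z."""
    rng = random.Random(seed)
    sums, counts = {0.0: 0.0, 1.0: 0.0}, {0.0: 0, 1.0: 0}
    for _ in range(n):
        v = run_scm(sample_noise(rng))
        sums[v["X"]] += v["Y"]
        counts[v["X"]] += 1
    return sums[1.0] / counts[1.0] - sums[0.0] / counts[0.0]

if __name__ == "__main__":
    print("interventional effect:", average_treatment_effect())
    print("naive observational difference:", naive_difference())
```

In this toy model the interventional effect is approximately 2.0 by construction (the coefficient on X), while the naive observational difference is noticeably larger because Z drives both X and Y. A query such as "what is the effect of X on Y?" therefore has an answer that can be checked by execution, which is the property the abstract relies on for scalable supervision.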

Problem

Research questions and friction points this paper is trying to address.

causal reasoning

large language models

structural causal models

data scarcity

executable representations

Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal Reasoning

Structural Causal Models

Large Language Models