CauSim: Scaling Causal Reasoning with Increasingly Complex Causal Simulators

📅 2026-05-09

🤖 AI Summary
Large language models remain weak at causal reasoning because causal systems are complex, are often described in non-executable forms, and rarely come with ground-truth answers to causal queries. This work proposes CauSim, a framework in which LLMs incrementally build executable structural causal model (SCM) simulators whose answers to causal queries can be verified. The approach reframes causal reasoning as a scalable supervised learning problem through curriculum-based complexity progression, translation in both directions between natural-language and executable representations, and data augmentation from formalized domain knowledge. The reported results show generalization across representations, LLM self-improvement from self-generated simulators, and consistent gains from curriculum scaling and larger data volume.
📝 Abstract
Despite surpassing human performance across mathematics, coding, and other knowledge-intensive tasks, large language models (LLMs) continue to struggle with causal reasoning. A core obstacle is the target data itself: causal systems are complex and often expressed in non-executable forms, while ground-truth answers to causal queries are inherently scarce. We introduce CauSim, a framework that turns causal reasoning from a scarce-label problem into a scalable supervised one. CauSim constructs increasingly complex causal simulators: executable structural causal models (SCMs), incrementally built by LLMs, that scale to globally complex systems while maintaining verifiable answers to causal queries. CauSim operates across representations by formalizing non-executable causal knowledge into code, enabling data augmentation, and translating executable SCMs into natural language, enabling supervision in previously difficult-to-supervise representations. We structure our research into two parts: (1) how to construct increasingly complex causal simulators, and (2) a systematic study of what CauSim enables, demonstrating generalization across representations, consistent gains from curriculum scaling and data volume, LLM self-improvement through self-generated simulators, and data augmentation via formalization of existing domain knowledge.
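To make "executable SCM with verifiable answers" concrete, here is a minimal Python sketch of a toy simulator. It is an illustration only, not CauSim's implementation: the variables (`rain`, `sprinkler`, `wet_grass`), the probabilities, and the helper names `sample_noise`, `run_scm`, and `interventional_prob` are all hypothetical. The point it shows is that once the structural equations are code, the ground-truth answer to an interventional query can be computed by running the program instead of being hand-labeled.

```python
import random


def sample_noise(rng):
    """Draw the exogenous noise terms once so they can be reused across worlds."""
    return {
        "u_rain": rng.random(),
        "u_sprinkler": rng.random(),
        "u_wet": rng.random(),
    }


def run_scm(noise, do=None):
    """Evaluate the structural equations in topological order.

    `do` maps variable names to forced values (an intervention); a forced
    variable ignores its own structural equation.
    """
    do = do or {}
    v = {}
    # Hypothetical mechanisms, chosen only for illustration.
    v["rain"] = do.get("rain", noise["u_rain"] < 0.3)
    v["sprinkler"] = do.get(
        "sprinkler", (not v["rain"]) and noise["u_sprinkler"] < 0.5
    )
    v["wet_grass"] = do.get(
        "wet_grass", (v["rain"] or v["sprinkler"]) and noise["u_wet"] < 0.9
    )
    return v


def interventional_prob(target, do, n=10_000, seed=0):
    """Monte Carlo estimate of P(target = True | do(...))."""
    rng = random.Random(seed)
    hits = sum(bool(run_scm(sample_noise(rng), do)[target]) for _ in range(n))
    return hits / n


if __name__ == "__main__":
    # Interventional query: how often is the grass wet if we force rain?
    print(interventional_prob("wet_grass", {"rain": True}))

    # Counterfactual query: same noise, different world.
    noise = {"u_rain": 0.1, "u_sprinkler": 0.2, "u_wet": 0.5}
    print(run_scm(noise))                   # factual world
    print(run_scm(noise, {"rain": False}))  # "had it not rained"
```

Keeping the noise separate from the mechanisms is a deliberate choice in this sketch: reusing one noise draw under different `do` settings yields counterfactual answers, while resampling it yields interventional ones.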
Problem

Research questions and friction points this paper is trying to address.

causal reasoning
large language models
structural causal models
data scarcity
executable representations
Innovation

Methods, ideas, or system contributions that make the work stand out.

Causal Reasoning
Structural Causal Models
Large Language Models
Data Augmentation
Executable Simulation