Learning Local Causal World Models with State Space Models and Attention

📅 2025-05-04

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Existing world models, particularly Transformer-based architectures, exhibit fundamental limitations in learning causal representations, especially in capturing the local causal structures inherent in physical dynamics.

Method: This paper proposes a local causal world modeling framework grounded in State Space Models (SSMs), the first to integrate SSMs into neural world modeling. Leveraging their implicit temporal inductive bias and low-rank dynamical modeling capacity, the approach explicitly discovers local causal dependencies in physical environments. The architecture combines SSMs, lightweight attention, and a differentiable causal discovery module to enable interpretable and generalizable dynamics prediction.

Results: Experiments on standard physics simulation benchmarks show that the model matches or exceeds comparably sized Transformers in dynamics prediction accuracy, while substantially improving causal structure identification accuracy. These results support the suitability and promise of SSMs for causal world modeling.
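For readers unfamiliar with SSMs, the recurrence underlying them can be sketched in a few lines. This is a generic textbook discrete linear state space model, not the paper's architecture; the function names (`matvec`, `ssm_rollout`) and the matrices `A`, `B`, `C` are illustrative assumptions. The zero entries in `A` and `B` give a toy picture of a local dependency structure: the second state variable is influenced neither by the first nor by the input.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def ssm_rollout(A: Matrix, B: Matrix, C: Matrix,
                inputs: List[Vector], x0: Vector) -> List[Vector]:
    """Apply x <- A x + B u for each input u, emitting y = C x after each update."""
    x = x0
    outputs = []
    for u in inputs:
        x = [ax + bu for ax, bu in zip(matvec(A, x), matvec(B, u))]
        outputs.append(matvec(C, x))
    return outputs


# Hypothetical toy system with two state variables.
A = [[0.5, 0.0], [0.0, 0.9]]   # diagonal: each variable depends only on itself
B = [[1.0], [0.0]]             # the input drives only the first variable
C = [[1.0, 0.0], [0.0, 1.0]]   # observe the state directly
ys = ssm_rollout(A, B, C, inputs=[[1.0], [1.0], [1.0]], x0=[0.0, 1.0])
```

Changing the input sequence alters only the first output component here, which is the kind of sparse dependency pattern a causal discovery method would aim to recover from data.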

📝 Abstract

World modelling, i.e. building a representation of the rules that govern the world so as to predict its evolution, is an essential ability for any agent interacting with the physical world. Despite their impressive performance, many solutions fail to learn a causal representation of the environment they are trying to model, which would be necessary to gain a deep enough understanding of the world to perform complex tasks. With this work, we aim to broaden the research at the intersection of causality theory and neural world modelling by assessing the potential for causal discovery of the State Space Model (SSM) architecture, which has been shown to have several advantages over the widespread Transformer. We show empirically that, compared to an equivalent Transformer, an SSM can model the dynamics of a simple environment and learn a causal model at the same time with equivalent or better performance, thus paving the way for further experiments that lean into the strengths of SSMs and further enhance them with causal awareness.

Problem

Research questions and friction points this paper is trying to address.

Learning causal world models for environment prediction

Comparing State Space Models and Transformers for causality

Enhancing causal discovery in neural world modelling

Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses State Space Models for causal discovery

Combines SSM with attention mechanisms

Learns local causal world models effectively
