🤖 AI Summary
This paper addresses the problem of causal identifiability failure, where conventional causal assumptions break down, by introducing *algorithmic causality* as an alternative formal definition of causality. Methodologically, it leverages algorithmic information theory to jointly optimize causal structure and symmetry during unsupervised compression across multiple environments, minimizing an upper bound on Kolmogorov complexity without requiring intervention labels or target priors. Theoretically, this optimization is proven to spontaneously yield robust causal representations. Empirical evaluation on synthetic and out-of-distribution datasets confirms its effectiveness. The core contributions are threefold: (i) the first formalization of algorithmic causality; (ii) the unification of data compression, causality, and symmetry within a single principled framework; and (iii) a theoretically grounded, interpretable account of how implicit causal structures may emerge in black-box systems, such as large language models, through compression-driven learning.
📝 Abstract
We explore the relationship between causality, symmetry, and compression. We build on and generalize the known connection between learning and compression to a setting where causal models are not identifiable. We propose a framework where causality emerges as a consequence of compressing data across multiple environments. We introduce algorithmic causality as an alternative definition of causality for settings where the traditional assumptions for causal identifiability do not hold. We demonstrate how algorithmic causal and symmetric structures can emerge from minimizing upper bounds on Kolmogorov complexity, without knowledge of intervention targets. We hypothesize that these insights may also provide a novel perspective on the emergence of causality in machine learning models, such as large language models, where causal relationships may not be explicitly identifiable.
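The central quantity here, Kolmogorov complexity, is uncomputable, but any concrete compressor yields a computable upper bound (compressed length plus the constant-size decompressor). A minimal sketch of the intuition behind compressing across multiple environments, using off-the-shelf `zlib` rather than the paper's actual optimization, with the toy "environments" invented for illustration: structure shared across environments is paid for once when they are compressed jointly, so the joint bound falls below the sum of the separate bounds.

```python
import zlib

def k_upper_bound(data: bytes) -> int:
    # Compressed length is a computable upper bound on Kolmogorov
    # complexity, up to the constant cost of the decompressor.
    return len(zlib.compress(data, level=9))

# Two toy "environments" that share a common motif (the stand-in for
# shared causal/symmetric structure) but differ in the details.
env_a = b"ABAB" * 64 + b"X" * 8
env_b = b"ABAB" * 64 + b"Y" * 8

separate = k_upper_bound(env_a) + k_upper_bound(env_b)
joint = k_upper_bound(env_a + env_b)

# The shared motif is encoded once in the joint code, so the joint
# bound is tighter than the sum of per-environment bounds.
print(f"separate: {separate}  joint: {joint}")
assert joint < separate
```

This only illustrates why pooling environments tightens a complexity bound; the paper's contribution is optimizing representation structure (causal and symmetric) under such bounds, not the choice of compressor.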