🤖 AI Summary
This paper addresses a fundamental failure mode of the AIXI agent under the embedded-agency paradigm: because AIXI neglects its own physical embedding and causal coupling with the environment, it exhibits structural collapse, manifesting as self-referential inconsistency, resource-agnosticism, and model self-destruction.
Method: We formalize these embedding failure modes within universal AI theory, grounded in Solomonoff's prior and Bayesian decision theory. Using computability analysis and algorithmic information theory, we prove that such failures occur under universal-distribution assumptions. We further propose a distributional model over joint action/percept histories and construct a modified AIXI variant for theoretical analysis.
Contribution/Results: Our work gives a formal account of embedded-agency failure in universal AI and evaluates progress toward robust embedded AGI based on AIXI variants, delivering both a theoretical warning and a foundation for future work.
📝 Abstract
We rigorously discuss the commonly asserted failures of the AIXI reinforcement learning agent as a model of embedded agency. We attempt to formalize these failure modes and prove that they occur within the framework of universal artificial intelligence, focusing on a variant of AIXI that models the joint action/percept history as drawn from the universal distribution. We also evaluate the progress that has been made towards a successful theory of embedded agency based on variants of the AIXI agent.
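For context, the standard AIXI action rule (Hutter's formulation over chronological environments) and Solomonoff's universal distribution $M$ can be sketched as below. The variant discussed in the abstract applies a distribution of this universal kind to the *joint* action/percept history rather than to percepts alone; the exact conditioning is the paper's construction and is not reproduced here.

```latex
% Standard AIXI: pick the action maximizing expected total reward to
% horizon m, under the universal mixture over programs q for a
% universal (monotone) Turing machine U:
a_k \;=\; \arg\max_{a_k} \sum_{o_k r_k} \cdots \max_{a_m} \sum_{o_m r_m}
  \big[\, r_k + \cdots + r_m \,\big]
  \sum_{q \,:\, U(q,\, a_{1:m}) \,=\, o_{1:m} r_{1:m}} 2^{-\ell(q)}

% Solomonoff's universal distribution over finite strings x
% (programs p whose output starts with x), the kind of distribution
% the variant places over the joint action/percept history:
M(x) \;=\; \sum_{p \,:\, U(p) \,=\, x*} 2^{-\ell(p)}
```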