🤖 AI Summary
This work asks what internal structure an agent must possess to make low-regret decisions under uncertainty. Using structured families of action-conditioned prediction tasks, the paper shows that predictive internal states (such as beliefs or world models) are necessary for attaining low average-case regret, without assuming optimality, determinism, or access to an explicit model. Methodologically, predictive modeling is reduced to binary "betting" decisions, and regret bounds are shown to limit the probability mass an agent can place on suboptimal bets, which forces it to make the predictive distinctions needed to separate high-margin outcomes. In fully observed settings, the interventional transition kernel can be approximately recovered; under partial observability, belief-like memory and predictive state become necessary. The results also cover stochastic policies and evaluation under task distributions, and provide a theoretical foundation for agent architecture design.
📝 Abstract
As artificial agents become increasingly capable, what internal structure is *necessary* for an agent to act competently under uncertainty? Classical results show that optimal control can be *implemented* using belief states or world models, but not that such representations are required. We prove quantitative "selection theorems" showing that low *average-case regret* on structured families of action-conditioned prediction tasks forces an agent to implement a predictive, structured internal state. Our results cover stochastic policies, partial observability, and evaluation under task distributions, without assuming optimality, determinism, or access to an explicit model. Technically, we reduce predictive modeling to binary "betting" decisions and show that regret bounds limit probability mass on suboptimal bets, enforcing the predictive distinctions needed to separate high-margin outcomes. In fully observed settings, this yields approximate recovery of the interventional transition kernel; under partial observability, it implies necessity of belief-like memory and predictive state, addressing an open question in prior world-model recovery work.
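The step from a regret bound to a bound on probability mass can be illustrated with a Markov-style inequality. This is only a sketch under assumed notation, not the paper's actual theorem statement: the task distribution $\mathcal{D}$, the per-task margin $m_\tau \ge 0$, the probability $q_\tau$ that the agent takes the suboptimal side of bet $\tau$, the regret level $\varepsilon$, and the threshold $\gamma > 0$ are all symbols introduced here for illustration.

```latex
% Assumption: the expected regret on a single binary bet \tau is q_\tau m_\tau,
% i.e. the probability of taking the wrong side times the margin lost by doing so.
\[
  \mathbb{E}_{\tau \sim \mathcal{D}}\bigl[\, q_\tau \, m_\tau \,\bigr] \;\le\; \varepsilon
\]
% Restricting to bets with margin at least \gamma and using m_\tau \ge \gamma there:
\[
  \gamma \cdot \mathbb{E}_{\tau \sim \mathcal{D}}\bigl[\, q_\tau \, \mathbf{1}\{ m_\tau \ge \gamma \} \,\bigr]
  \;\le\;
  \mathbb{E}_{\tau \sim \mathcal{D}}\bigl[\, q_\tau \, m_\tau \,\bigr]
  \;\le\; \varepsilon
\]
% Hence the mass placed on wrong bets among high-margin tasks is small:
\[
  \mathbb{E}_{\tau \sim \mathcal{D}}\bigl[\, q_\tau \, \mathbf{1}\{ m_\tau \ge \gamma \} \,\bigr]
  \;\le\; \frac{\varepsilon}{\gamma}
\]
```

Read this way, an agent with average regret at most $\varepsilon$ must take the correct side on all but an $\varepsilon/\gamma$-weighted share of bets whose margin is at least $\gamma$, which is the sense in which low regret enforces predictive distinctions on high-margin outcomes.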