🤖 AI Summary
Deep reinforcement learning (DRL) suffers from low sample efficiency, a lack of prior guidance, and poor interpretability in low-data regimes. Method: We propose a generative world model that integrates object-centric priors with active inference. It employs a dynamically expandable Bayesian mixture model, combining piecewise-linear dynamics, online model growth, and Bayesian model reduction to quantify belief uncertainty and enable cross-task generalization, while maintaining a parameter count significantly smaller than mainstream DRL models. Contribution/Results: This work is the first to embed object-centric representations in an active inference framework that supports online structural evolution. Experiments across multiple game environments demonstrate expert-level performance within just 10⁴ interaction steps, training time reduced to minutes, and simultaneous gains in data efficiency, interpretability, and lightweight deployability.
📝 Abstract
Current deep reinforcement learning (DRL) approaches achieve state-of-the-art performance in various domains, but struggle with data efficiency compared to human learning, which leverages core priors about objects and their interactions. Active inference offers a principled framework for integrating sensory information with prior knowledge to learn a world model and quantify the uncertainty of its own beliefs and predictions. However, active inference models are usually crafted for a single task with bespoke knowledge, so they lack the domain flexibility typical of DRL approaches. To bridge this gap, we propose a novel architecture that integrates a minimal yet expressive set of core priors about object-centric dynamics and interactions to accelerate learning in low-data regimes. The resulting approach, which we call AXIOM, combines the data efficiency and interpretability typical of Bayesian approaches with the across-task generalization usually associated with DRL. AXIOM represents scenes as compositions of objects, whose dynamics are modeled as piecewise linear trajectories that capture sparse object-object interactions. The structure of the generative model is expanded online by growing and learning mixture models from single events, and periodically refined through Bayesian model reduction to induce generalization. AXIOM masters various games within only 10,000 interaction steps, using far fewer parameters than DRL approaches and without the computational expense of gradient-based optimization.
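The grow-then-reduce mixture idea can be illustrated in miniature. The sketch below is a hypothetical `GrowingLinearMixture` that spawns a new linear dynamics component whenever no existing one explains a transition within a threshold, refits components by least squares, and merges near-duplicate components as a crude stand-in for Bayesian model reduction. The threshold rule, least-squares refits, and merge criterion are illustrative assumptions, not AXIOM's actual variational updates.

```python
import numpy as np


class GrowingLinearMixture:
    """Sketch: online mixture of affine dynamics models x' ~= A x + b.

    Loosely mimics the paper's scheme: components are grown from single
    events when no existing component predicts a transition within `tau`,
    and near-duplicate components are merged (a simplistic analogue of
    Bayesian model reduction). Hypothetical, simplified illustration only.
    """

    def __init__(self, dim, tau=1.5, merge_eps=0.05):
        self.dim, self.tau, self.merge_eps = dim, tau, merge_eps
        self.components = []  # list of (A, b) pairs
        self.data = []        # per-component transition buffers

    def _error(self, k, x, x_next):
        A, b = self.components[k]
        return np.linalg.norm(A @ x + b - x_next)

    def observe(self, x, x_next):
        """Assign a transition to the best component, or grow a new one."""
        errs = [self._error(k, x, x_next) for k in range(len(self.components))]
        if not errs or min(errs) > self.tau:
            # Grow: initialize a new component from this single event
            # (identity dynamics plus the observed displacement).
            self.components.append((np.eye(self.dim), x_next - x))
            self.data.append([(x, x_next)])
            return len(self.components) - 1
        k = int(np.argmin(errs))
        self.data[k].append((x, x_next))
        self._refit(k)
        return k

    def _refit(self, k):
        """Least-squares refit of component k on its assigned transitions."""
        xs = np.array([x for x, _ in self.data[k]])
        ys = np.array([y for _, y in self.data[k]])
        if len(xs) <= self.dim:  # not enough data for a stable refit
            return
        X = np.hstack([xs, np.ones((len(xs), 1))])  # affine design matrix
        W, *_ = np.linalg.lstsq(X, ys, rcond=None)  # solve ys ~= X W
        self.components[k] = (W[: self.dim].T, W[self.dim])

    def reduce(self):
        """Merge components whose parameters nearly coincide."""
        merged = True
        while merged:
            merged = False
            for i in range(len(self.components)):
                for j in range(i + 1, len(self.components)):
                    Ai, bi = self.components[i]
                    Aj, bj = self.components[j]
                    if (np.linalg.norm(Ai - Aj)
                            + np.linalg.norm(bi - bj)) < self.merge_eps:
                        self.data[i] += self.data[j]
                        del self.components[j], self.data[j]
                        self._refit(i)
                        merged = True
                        break
                if merged:
                    break
```

Feeding transitions from two distinct linear regimes (say, x' = 0.5x + 1 and x' = 0.5x − 10) grows exactly one component per regime; the real model replaces these point estimates with Bayesian posteriors, so uncertainty about each component's dynamics is quantified rather than discarded.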