🤖 AI Summary
Deep reinforcement learning (DRL) suffers from low sample efficiency, a lack of prior guidance, and poor interpretability in low-data regimes. Method: We propose a generative world model that integrates object-centric priors with active inference. It employs a dynamically expandable Bayesian mixture model, combining piecewise-linear dynamics, online model growth, and Bayesian model reduction to quantify belief uncertainty and enable cross-task generalization, while maintaining a parameter count significantly smaller than mainstream DRL models. Contribution/Results: This work is the first to embed object-centric representations in an active inference framework that supports online structural evolution. Experiments across multiple game environments demonstrate expert-level performance within just 10⁴ interaction steps, training time reduced to minutes, and simultaneous gains in data efficiency, interpretability, and lightweight deployability.
📝 Abstract
Current deep reinforcement learning (DRL) approaches achieve state-of-the-art performance in various domains, but struggle with data efficiency compared to human learning, which leverages core priors about objects and their interactions. Active inference offers a principled framework for integrating sensory information with prior knowledge to learn a world model and quantify the uncertainty of its own beliefs and predictions. However, active inference models are usually crafted for a single task with bespoke knowledge, so they lack the domain flexibility typical of DRL approaches. To bridge this gap, we propose a novel architecture that integrates a minimal yet expressive set of core priors about object-centric dynamics and interactions to accelerate learning in low-data regimes. The resulting approach, which we call AXIOM, combines the data efficiency and interpretability typical of Bayesian approaches with the across-task generalization usually associated with DRL. AXIOM represents scenes as compositions of objects, whose dynamics are modeled as piecewise linear trajectories that capture sparse object-object interactions. The structure of the generative model is expanded online by growing and learning mixture models from single events, and periodically refined through Bayesian model reduction to induce generalization. AXIOM masters various games within only 10,000 interaction steps, using far fewer parameters than DRL approaches and without the computational expense of gradient-based optimization.
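The grow-then-reduce mixture idea can be illustrated in miniature. The sketch below is a hypothetical `GrowingLinearMixture` that spawns a new linear dynamics component whenever no existing one explains a transition within a threshold, refits components by least squares, and merges near-duplicate components as a crude stand-in for Bayesian model reduction. The threshold rule, least-squares refits, and merge criterion are illustrative assumptions, not AXIOM's actual variational updates.

```python
import numpy as np


class GrowingLinearMixture:
    """Sketch: online mixture of affine dynamics models x' ~= A x + b.

    Loosely mimics the paper's scheme: components are grown from single
    events when no existing component predicts a transition within `tau`,
    and near-duplicate components are merged (a simplistic analogue of
    Bayesian model reduction). Hypothetical, simplified illustration only.
    """

    def __init__(self, dim, tau=1.5, merge_eps=0.05):
        self.dim, self.tau, self.merge_eps = dim, tau, merge_eps
        self.components = []  # list of (A, b) pairs
        self.data = []        # per-component transition buffers

    def _error(self, k, x, x_next):
        A, b = self.components[k]
        return np.linalg.norm(A @ x + b - x_next)

    def observe(self, x, x_next):
        """Assign a transition to the best component, or grow a new one."""
        errs = [self._error(k, x, x_next) for k in range(len(self.components))]
        if not errs or min(errs) > self.tau:
            # Grow: initialize a new component from this single event
            # (identity dynamics plus the observed displacement).
            self.components.append((np.eye(self.dim), x_next - x))
            self.data.append([(x, x_next)])
            return len(self.components) - 1
        k = int(np.argmin(errs))
        self.data[k].append((x, x_next))
        self._refit(k)
        return k

    def _refit(self, k):
        """Least-squares refit of component k on its assigned transitions."""
        xs = np.array([x for x, _ in self.data[k]])
        ys = np.array([y for _, y in self.data[k]])
        if len(xs) <= self.dim:  # not enough data for a stable refit
            return
        X = np.hstack([xs, np.ones((len(xs), 1))])  # affine design matrix
        W, *_ = np.linalg.lstsq(X, ys, rcond=None)  # solve ys ~= X W
        self.components[k] = (W[: self.dim].T, W[self.dim])

    def reduce(self):
        """Merge components whose parameters nearly coincide."""
        merged = True
        while merged:
            merged = False
            for i in range(len(self.components)):
                for j in range(i + 1, len(self.components)):
                    Ai, bi = self.components[i]
                    Aj, bj = self.components[j]
                    if (np.linalg.norm(Ai - Aj)
                            + np.linalg.norm(bi - bj)) < self.merge_eps:
                        self.data[i] += self.data[j]
                        del self.components[j], self.data[j]
                        self._refit(i)
                        merged = True
                        break
                if merged:
                    break
```

Feeding transitions from two distinct linear regimes (say, x' = 0.5x + 1 and x' = 0.5x − 10) grows exactly one component per regime; the real model replaces these point estimates with Bayesian posteriors, so uncertainty about each component's dynamics is quantified rather than discarded.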