🤖 AI Summary
Current agents generalize poorly when the environment changes out of distribution, for example when interfaces, dynamics, observations, or feedback signals shift. This position paper argues for “environment scaling”: improving cross-environment generalization by broadening the distribution of executable rule-sets an agent interacts with, rather than merely increasing the number of trajectories or tasks. It proposes a unified taxonomy that separates trajectory scaling, task scaling, and environment scaling, and argues that expanding the distribution at the environment level is essential for robust, general-purpose agents. It contrasts two paradigms for constructing scalable environments, programmatic generators and generative world models, and outlines how environment scaling can be coupled with stateful learning mechanisms for cross-environment adaptation. The paper thus offers a conceptual framework and technical pathway toward measurable and controllable progress on robust general agents.
📝 Abstract
Generalizable agents should adapt to diverse tasks and unseen environments beyond their training distribution. This position paper argues that such generalization requires environment scaling: expanding the distribution of executable rule-sets that agents interact with, rather than only increasing trajectories or tasks within fixed benchmarks. Current scaling practices largely focus on collecting more experience or broader task sets under fixed interaction rules, leaving agents brittle when underlying interfaces, dynamics, observations, or feedback signals change. The core challenge is therefore a world-level distribution shift: agents need systematic exposure to environments with meaningfully different executable rule-sets. To clarify this challenge, we propose a unified taxonomy that separates trajectory scaling, task scaling, and environment scaling by their primary deliverables and by what changes in the executable rule-set. Building on this taxonomy, we synthesize construction paradigms for scalable environments, contrasting programmatic generators that prioritize controllability and verifiability with generative world models that offer broader coverage and open-endedness. We further outline how environment scaling can be coupled with stateful learning mechanisms, emphasizing learned update rules for cross-environment adaptation. We conclude by discussing alternative perspectives and argue that scalable environments provide the essential substrate for measurable and controllable progress toward robust general agents.
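To make the three scaling axes concrete, the toy sketch below may help. It is not taken from the paper: the `RuleSet` class, the 1-D line world, and `greedy_policy` are all hypothetical names invented for illustration, and the "rule-set" here only covers transition dynamics, whereas the abstract also mentions interfaces, observations, and feedback signals.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class RuleSet:
    """Hypothetical stand-in for an executable rule-set on a 1-D line of cells."""
    step_size: int   # dynamics: cells moved per action
    wrap: bool       # dynamics: whether the line wraps around at its ends
    size: int = 10   # number of cells on the line

    def transition(self, state: int, action: int) -> int:
        nxt = state + action * self.step_size
        if self.wrap:
            return nxt % self.size
        return max(0, min(self.size - 1, nxt))  # clamp to the line


def greedy_policy(state: int, goal: int) -> int:
    """Fixed policy that implicitly assumes step_size == 1."""
    return 1 if goal > state else -1


def rollout(rules: RuleSet, goal: int, start: int = 0,
            horizon: int = 20) -> List[Tuple[int, int]]:
    """Run the fixed policy and record (state, action) pairs."""
    state, trajectory = start, []
    for _ in range(horizon):
        action = greedy_policy(state, goal)
        state = rules.transition(state, action)
        trajectory.append((state, action))
        if state == goal:
            break
    return trajectory


def reached(trajectory: List[Tuple[int, int]], goal: int) -> bool:
    return trajectory[-1][0] == goal


base = RuleSet(step_size=1, wrap=False)
rng = random.Random(0)

# Trajectory scaling: same rules, same task, more sampled episodes.
trajectory_scaled = [rollout(base, goal=5, start=rng.randrange(base.size))
                     for _ in range(4)]

# Task scaling: same rules, different goals.
goals = (3, 5, 8)
task_scaled = [rollout(base, goal=g) for g in goals]

# Environment scaling: the rule-set itself changes.
altered = [RuleSet(step_size=2, wrap=False), RuleSet(step_size=3, wrap=True)]
env_scaled = [rollout(rules, goal=5) for rules in altered]
```

In this toy setup the fixed policy reaches the goal in every trajectory-scaled and task-scaled episode, but in none of the episodes under the two altered rule-sets, which is a small-scale analogue of the brittleness the abstract describes. Adding more episodes or goals under `base` would not expose that failure; only changing the rule-set does.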