Scaling Environments for LLM Agents in the Era of Learning from Interaction: A Survey

📅 2025-11-12
🤖 AI Summary
To address the limited adaptability and long-horizon decision-making of large language model (LLM)-based agents in complex, realistic, interactive environments, this survey proposes the Generation-Execution-Feedback (GEF) loop as an organizing framework. It is the first to systematically analyze the environment's role, from an environment-centric perspective, across three stages: task generation, task execution, and feedback. The work unifies fragmented environment-scaling approaches into a coherent analytical framework, covering reinforcement learning paradigms, automated task generation, dynamic environment modeling, and rollout-based evaluation with feedback. It further reviews benchmarks, implementation strategies, and applications, clarifying a scalable, environment-driven pathway for advancing agent capabilities and offering both theoretical grounding and practical guidance for research on embodied intelligence and autonomous agents.

📝 Abstract
LLM-based agents can autonomously accomplish complex tasks across various domains. However, to further cultivate capabilities such as adaptive behavior and long-term decision-making, training on static datasets built from human-level knowledge is insufficient. These datasets are costly to construct and lack both dynamism and realism. A growing consensus is that agents should instead interact directly with environments and learn from experience through reinforcement learning. We formalize this iterative process as the Generation-Execution-Feedback (GEF) loop, where environments generate tasks to challenge agents, return observations in response to agents' actions during task execution, and provide evaluative feedback on rollouts for subsequent learning. Under this paradigm, environments function as indispensable producers of experiential data, highlighting the need to scale them toward greater complexity, realism, and interactivity. In this survey, we systematically review representative methods for environment scaling from a pioneering environment-centric perspective and organize them along the stages of the GEF loop, namely task generation, task execution, and feedback. We further analyze benchmarks, implementation strategies, and applications, consolidating fragmented advances and outlining future research directions for agent intelligence.
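The three GEF stages described in the abstract can be sketched as a minimal loop. This is an illustrative sketch only: the survey does not prescribe an interface, and every class, method, and the toy numeric task below are hypothetical assumptions chosen to make the loop concrete.

```python
# Illustrative sketch of the Generation-Execution-Feedback (GEF) loop.
# All names (Environment, generate_task, feedback, ...) are hypothetical.

class Environment:
    """Toy environment: generates tasks, executes actions, scores rollouts."""

    def generate_task(self):
        # Generation stage: the environment produces a task to challenge the agent.
        return {"goal": 3, "max_steps": 5}

    def reset(self, task):
        self.state, self.goal = 0, task["goal"]
        return self.state  # initial observation

    def step(self, action):
        # Execution stage: return an observation in response to the agent's action.
        self.state += action
        done = self.state >= self.goal
        return self.state, done

    def feedback(self, rollout):
        # Feedback stage: evaluate the full rollout for subsequent learning.
        return 1.0 if rollout and rollout[-1][1] >= self.goal else 0.0


def gef_loop(env, policy, iterations=3):
    """Run the GEF loop, accumulating (task, rollout, reward) experience."""
    experience = []
    for _ in range(iterations):
        task = env.generate_task()            # 1. Generation
        obs, rollout, done = env.reset(task), [], False
        for _ in range(task["max_steps"]):    # 2. Execution
            action = policy(obs)
            obs, done = env.step(action)
            rollout.append((action, obs))
            if done:
                break
        reward = env.feedback(rollout)        # 3. Feedback
        experience.append((task, rollout, reward))
    return experience


# Usage: a trivial policy that always increments the state by 1.
experience = gef_loop(Environment(), policy=lambda obs: 1)
```

The key design point the abstract emphasizes is that the environment, not a static dataset, produces all three kinds of data: the task, the per-step observations, and the rollout-level evaluation the agent later learns from.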
Problem

Research questions and friction points this paper is trying to address.

Scaling environments to enhance LLM agent learning capabilities
Addressing the limitations of static datasets through interactive environments
Developing complex, realistic environments for agent training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes the Generation-Execution-Feedback (GEF) loop framework
Scales environments toward complexity, realism, and interactivity
Systematically reviews task generation, task execution, and feedback methods