Scaling Environments for LLM Agents in the Era of Learning from Interaction: A Survey

📅 2025-11-12
🤖 AI Summary
To address the limited adaptability and long-horizon decision-making of large language model (LLM)-based agents in complex, realistic, interactive environments, this survey proposes the Generation-Execution-Feedback (GEF) loop as an organizing framework. It is the first to systematically analyze the environment's role, from an environment-centric perspective, across three stages: task generation, task execution, and feedback. The work unifies fragmented environment-scaling approaches into a coherent analytical framework, covering reinforcement learning paradigms, automated task generation, dynamic environment modeling, and rollout-based evaluation with feedback. It further reviews benchmarks, implementation strategies, and applications, clarifying a scalable, environment-driven pathway for advancing agent capabilities and offering both theoretical grounding and practical guidance for research on embodied intelligence and autonomous agents.

📝 Abstract
LLM-based agents can autonomously accomplish complex tasks across various domains. However, to further cultivate capabilities such as adaptive behavior and long-term decision-making, training on static datasets built from human-level knowledge is insufficient. These datasets are costly to construct and lack both dynamism and realism. A growing consensus is that agents should instead interact directly with environments and learn from experience through reinforcement learning. We formalize this iterative process as the Generation-Execution-Feedback (GEF) loop, where environments generate tasks to challenge agents, return observations in response to agents' actions during task execution, and provide evaluative feedback on rollouts for subsequent learning. Under this paradigm, environments function as indispensable producers of experiential data, highlighting the need to scale them toward greater complexity, realism, and interactivity. In this survey, we systematically review representative methods for environment scaling from a pioneering environment-centric perspective and organize them along the stages of the GEF loop, namely task generation, task execution, and feedback. We further analyze benchmarks, implementation strategies, and applications, consolidating fragmented advances and outlining future research directions for agent intelligence.
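The three GEF stages described in the abstract can be sketched as a minimal loop. This is an illustrative sketch only: the survey does not prescribe an interface, and every class, method, and the toy numeric task below are hypothetical assumptions chosen to make the loop concrete.

```python
# Illustrative sketch of the Generation-Execution-Feedback (GEF) loop.
# All names (Environment, generate_task, feedback, ...) are hypothetical.

class Environment:
    """Toy environment: generates tasks, executes actions, scores rollouts."""

    def generate_task(self):
        # Generation stage: the environment produces a task to challenge the agent.
        return {"goal": 3, "max_steps": 5}

    def reset(self, task):
        self.state, self.goal = 0, task["goal"]
        return self.state  # initial observation

    def step(self, action):
        # Execution stage: return an observation in response to the agent's action.
        self.state += action
        done = self.state >= self.goal
        return self.state, done

    def feedback(self, rollout):
        # Feedback stage: evaluate the full rollout for subsequent learning.
        return 1.0 if rollout and rollout[-1][1] >= self.goal else 0.0


def gef_loop(env, policy, iterations=3):
    """Run the GEF loop, accumulating (task, rollout, reward) experience."""
    experience = []
    for _ in range(iterations):
        task = env.generate_task()            # 1. Generation
        obs, rollout, done = env.reset(task), [], False
        for _ in range(task["max_steps"]):    # 2. Execution
            action = policy(obs)
            obs, done = env.step(action)
            rollout.append((action, obs))
            if done:
                break
        reward = env.feedback(rollout)        # 3. Feedback
        experience.append((task, rollout, reward))
    return experience


# Usage: a trivial policy that always increments the state by 1.
experience = gef_loop(Environment(), policy=lambda obs: 1)
```

The key design point the abstract emphasizes is that the environment, not a static dataset, produces all three kinds of data: the task, the per-step observations, and the rollout-level evaluation the agent later learns from.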
Problem

Research questions and friction points this paper is trying to address.

Scaling environments to enhance LLM agent learning capabilities
Addressing the limitations of static datasets through interactive environments
Developing complex, realistic environments for agent training
Innovation

Methods, ideas, or system contributions that make the work stand out.

Proposes the Generation-Execution-Feedback (GEF) loop framework
Scales environments toward complexity, realism, and interactivity
Systematically reviews task generation, task execution, and feedback methods