CuES: A Curiosity-driven and Environment-grounded Synthesis Framework for Agentic RL

📅 2025-12-01
🤖 AI Summary
Problem: In task-scarce settings, agentic reinforcement learning (RL) is hindered by the absence of predefined, structured tasks, which limits policy training and generalization. Method: This paper formalizes the "task generation" problem and proposes an environment-grounded, intrinsically motivated framework for autonomous task generation. Instead of relying on manual task design, it combines interactive abstraction, memory-augmented quality control, lightweight top-down guidance, and inductive evolution of reusable task patterns to automatically construct diverse, executable, and semantically coherent tasks. Results: Experiments on AppWorld, BFCL, and WebShop show that the generated tasks match or surpass manually curated datasets in diversity and executability and substantially improve downstream policy learning, reducing the dependency of agentic RL on handcrafted task specifications.

📝 Abstract
Large language model-based agents are increasingly deployed in complex, tool-augmented environments. While reinforcement learning provides a principled mechanism for such agents to improve through interaction, its effectiveness critically depends on the availability of structured training tasks. In many realistic settings, however, no such tasks exist, a challenge we term task scarcity, which has become a key bottleneck for scaling agentic RL. Existing approaches typically assume predefined task collections, an assumption that fails in novel environments where tool semantics and affordances are initially unknown. To address this limitation, we formalize the problem of Task Generation for Agentic RL, where an agent must learn within a given environment that lacks predefined tasks. We propose CuES, a Curiosity-driven and Environment-grounded Synthesis framework that autonomously generates diverse, executable, and meaningful tasks directly from the environment structure and affordances, without relying on handcrafted seeds or external corpora. CuES drives exploration through intrinsic curiosity, abstracts interaction patterns into reusable task schemas, and refines them through lightweight top-down guidance and memory-based quality control. Across three representative environments, AppWorld, BFCL, and WebShop, CuES produces task distributions that match or surpass manually curated datasets in both diversity and executability, yielding substantial downstream policy improvements. These results demonstrate that curiosity-driven, environment-grounded task generation provides a scalable foundation for agents that not only learn how to act, but also learn what to learn. The code is available at https://github.com/modelscope/AgentEvolver/research/CuES.
Problem

Research questions and friction points this paper is trying to address.

Generates tasks for agentic RL without predefined tasks
Addresses task scarcity in novel tool-augmented environments
Autonomously creates diverse executable tasks from environment structure
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generates tasks from environment structure without predefined seeds
Uses intrinsic curiosity to drive exploration and task creation
Abstracts interaction patterns into reusable task schemas
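
The three ideas above can be illustrated with a toy sketch. It is not the CuES implementation: the environment (`TOY_ENV`), the function names (`explore`, `replay`, `generate_tasks`), and the count-based novelty rule standing in for intrinsic curiosity are all assumptions made for illustration. The sketch explores a small state graph by preferring rarely tried actions, records each trajectory as a candidate task, and keeps only candidates that are new to memory and that replay successfully.

```python
import random
from collections import Counter

# Hypothetical toy environment: state -> {action: next_state}.
TOY_ENV = {
    "home": {"search": "results", "open_cart": "cart"},
    "results": {"view_item": "item", "back": "home"},
    "item": {"add_to_cart": "cart", "back": "results"},
    "cart": {"checkout": "done", "back": "home"},
    "done": {},
}


def explore(env, start, max_steps, visit_counts, rng):
    """Roll out one trajectory, preferring the least-tried (state, action) pairs."""
    state, actions_taken = start, []
    for _ in range(max_steps):
        available = list(env[state])
        if not available:
            break
        # Count-based novelty: an assumed stand-in for an intrinsic curiosity signal.
        least = min(visit_counts[(state, a)] for a in available)
        action = rng.choice([a for a in available if visit_counts[(state, a)] == least])
        visit_counts[(state, action)] += 1
        actions_taken.append(action)
        state = env[state][action]
    return tuple(actions_taken), state


def replay(env, start, actions):
    """Return the final state if every action is valid in sequence, else None."""
    state = start
    for action in actions:
        if action not in env[state]:
            return None
        state = env[state][action]
    return state


def generate_tasks(env, start="home", episodes=20, max_steps=4, seed=0):
    """Collect distinct, replayable action sequences as candidate tasks."""
    rng = random.Random(seed)
    visit_counts = Counter()
    memory = set()  # action sequences already accepted
    tasks = []
    for _ in range(episodes):
        actions, goal = explore(env, start, max_steps, visit_counts, rng)
        if not actions or actions in memory:
            continue  # memory-based filter: skip empty or duplicate candidates
        if replay(env, start, actions) != goal:
            continue  # executability check: the sequence must reproduce its goal
        memory.add(actions)
        tasks.append({"actions": actions, "goal_state": goal})
    return tasks


if __name__ == "__main__":
    for task in generate_tasks(TOY_ENV):
        print(task)
```

In a real tool-augmented environment the transition table is unknown in advance, so the exploration step would call the environment itself rather than read a dictionary, and the duplicate check would likely need a looser notion of similarity than exact equality.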