🤖 AI Summary
To address the reliance on human expertise and the poor generalizability of curriculum design in reinforcement learning (RL), this paper proposes a fully automated RL curriculum generation framework powered by large language models (LLMs). The method operates via a three-stage pipeline: (1) natural-language subtask decomposition; (2) end-to-end compilation of executable reward functions and goal-conditioned code; and (3) curriculum refinement driven by evaluation of policy rollout trajectories. Crucially, it systematically integrates LLMs' planning and code-generation capabilities into RL curriculum design, eliminating manual intervention while enabling cross-domain curriculum synthesis across manipulation, navigation, and humanoid locomotion. Experiments in diverse robotic simulation environments demonstrate substantial gains in the efficiency of acquiring complex skills. Furthermore, policies generated by the framework transfer successfully to a real-world humanoid robot, validating effective sim-to-real deployment. This work establishes a scalable, domain-agnostic paradigm for automated curriculum learning in RL.
📝 Abstract
Curriculum learning is a training mechanism in reinforcement learning (RL) that facilitates learning complex policies by progressively increasing the task difficulty during training. However, designing an effective curriculum for a specific task often requires extensive domain knowledge and human intervention, which limits its applicability across domains. Our core idea is that large language models (LLMs), with their extensive training on diverse language data and ability to encapsulate world knowledge, hold significant potential for efficiently breaking down tasks and decomposing skills across various robotics environments. Additionally, the demonstrated success of LLMs in translating natural language into executable code for RL agents strengthens their role in generating task curricula. In this work, we propose CurricuLLM, which leverages the high-level planning and programming capabilities of LLMs for curriculum design, thereby enabling efficient learning of complex target tasks. CurricuLLM consists of: (Step 1) generating a sequence of subtasks, described in natural language, that aid learning of the target task, (Step 2) translating the natural language description of each subtask into executable task code, including the reward code and goal distribution code, and (Step 3) evaluating the trained policies based on trajectory rollouts and the subtask descriptions. We evaluate CurricuLLM in various robotics simulation environments, spanning manipulation, navigation, and locomotion, to show that CurricuLLM can aid learning of complex robot control tasks. In addition, we validate the humanoid locomotion policy learned through CurricuLLM in the real world. The project website is https://iconlab.negarmehr.com/CurricuLLM/
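The three-step loop above can be sketched in code. This is a minimal, hypothetical illustration, not the paper's implementation: the LLM calls and the RL trainer are stubbed out with placeholders, and all function names (`propose_subtasks`, `subtask_to_code`, `train_and_rollout`) are assumptions introduced here for clarity.

```python
# Hypothetical sketch of CurricuLLM's three-step loop.
# In the real system, an LLM is prompted at Steps 1 and 2 and a full RL
# trainer runs between them; here those pieces are replaced by stubs.

def propose_subtasks(target_task):
    # Step 1 (stub): an LLM would decompose the target task into a
    # sequence of progressively harder subtasks in natural language.
    return [f"{target_task}: subtask {i}" for i in range(1, 4)]

def subtask_to_code(description):
    # Step 2 (stub): an LLM would translate the description into an
    # executable reward function and a goal-distribution sampler.
    reward_fn = lambda traj: float(len(traj))      # placeholder reward
    goal_sampler = lambda: {"goal": description}   # placeholder goals
    return reward_fn, goal_sampler

def train_and_rollout(policy, reward_fn, goal_sampler):
    # Placeholder for RL training followed by a trajectory rollout.
    traj = [goal_sampler() for _ in range(5)]
    return policy, traj, reward_fn(traj)

def curricullm_loop(target_task):
    policy, history = None, []
    for desc in propose_subtasks(target_task):           # Step 1
        reward_fn, goal_sampler = subtask_to_code(desc)  # Step 2
        policy, traj, score = train_and_rollout(policy, reward_fn, goal_sampler)
        history.append((desc, score))                    # Step 3: evaluate rollout
    return policy, history

policy, history = curricullm_loop("humanoid walking")
print(len(history))  # one entry per subtask in the curriculum
```

In the actual framework, Step 3's evaluation feeds back into curriculum refinement; the stub simply records a score per subtask to show where that feedback would attach.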