LLM-Guided Task- and Affordance-Level Exploration in Reinforcement Learning

📅 2025-09-20
📈 Citations: 0
✨ Influential: 0
📄 PDF
🤖 AI Summary
Reinforcement learning (RL) for robotic manipulation suffers from low sample efficiency and the difficulty of exploring high-dimensional state-action spaces. While existing large language model (LLM)-guided approaches improve semantic coherence, they often neglect physical feasibility, yielding unreliable policies. This paper proposes a two-tier LLM-guided framework: (1) task-level goal decomposition and (2) affordance-level enforcement of kinematic and dynamic constraints, augmented by an online behavior-refinement mechanism that corrects suboptimal LLM-generated action proposals during training. The method enables multimodal, high-quality exploration without human annotation. Evaluated on standard pick-and-place RL benchmarks, it achieves significant improvements in sample efficiency (+42%) and success rate (+31%). Furthermore, it demonstrates zero-shot sim-to-real transfer on a physical robot, validating its generalizability and practical utility.

๐Ÿ“ Abstract
Reinforcement learning (RL) is a promising approach for robotic manipulation, but it can suffer from low sample efficiency and requires extensive exploration of large state-action spaces. Recent methods leverage the commonsense knowledge and reasoning abilities of large language models (LLMs) to guide exploration toward more meaningful states. However, LLMs can produce plans that are semantically plausible yet physically infeasible, yielding unreliable behavior. We introduce LLM-TALE, a framework that uses LLMs' planning to directly steer RL exploration. LLM-TALE integrates planning at both the task level and the affordance level, improving learning efficiency by directing agents toward semantically meaningful actions. Unlike prior approaches that assume optimal LLM-generated plans or rewards, LLM-TALE corrects suboptimality online and explores multimodal affordance-level plans without human supervision. We evaluate LLM-TALE on pick-and-place tasks in standard RL benchmarks, observing improvements in both sample efficiency and success rates over strong baselines. Real-robot experiments indicate promising zero-shot sim-to-real transfer. Code and supplementary material are available at https://llm-tale.github.io.
Problem

Research questions and friction points this paper is trying to address.

Improving sample efficiency in robotic manipulation reinforcement learning
Addressing physically infeasible plans generated by large language models
Enabling autonomous correction of suboptimal plans during RL exploration
Innovation

Methods, ideas, or system contributions that make the work stand out.

LLM-guided planning at both the task and affordance levels
Autonomous online correction of suboptimal plans
Multimodal affordance-level exploration without human supervision
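The two-tier idea summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`llm_decompose`, `feasible`, `guided_explore`) and the weight-decay correction rule are hypothetical stand-ins, assuming an LLM call that decomposes the task into subgoals, a feasibility filter at the affordance level, and an online mechanism that down-weights proposals that fail in the environment.

```python
# Hypothetical sketch of two-tier LLM-guided exploration (not the paper's code).
# Tier 1: an LLM decomposes the task into subgoals.
# Tier 2: affordance-level proposals are filtered by a feasibility check,
# and proposal weights are refined online from execution outcomes.

def llm_decompose(task):
    # Stand-in for an LLM call that returns an ordered list of subgoals.
    return ["reach(object)", "grasp(object)", "move(target)", "release()"]

def feasible(action, state):
    # Stand-in for a kinematic/dynamic feasibility check
    # (e.g., IK solvable, within joint limits). Here: trivially accept.
    return True

def guided_explore(task, env_step, episodes=10):
    subgoals = llm_decompose(task)
    # Online refinement: one proposal weight per subgoal, updated from outcomes.
    weights = {g: 1.0 for g in subgoals}
    trajectory = []
    for _ in range(episodes):
        state = "init"
        for goal in subgoals:
            if not feasible(goal, state):
                continue  # skip physically infeasible proposals
            ok, state = env_step(goal, state)
            # Online correction: reinforce proposals that succeed,
            # decay proposals that fail.
            weights[goal] *= 1.1 if ok else 0.5
            trajectory.append((goal, ok))
    return weights, trajectory
```

In this sketch the RL agent would sample subgoals in proportion to `weights`, so exploration concentrates on semantically meaningful *and* empirically achievable actions rather than assuming the LLM plan is optimal.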
Jelle Luijkx
Cognitive Robotics, Delft University of Technology, The Netherlands
Runyu Ma
Cognitive Robotics, Delft University of Technology, The Netherlands
Zlatan Ajanović
RWTH Aachen University, ex: TU Delft, TU Graz, UNSA
AI and Robotics, Search, Exploration in RL, Task and Motion Planning, Optimal Control
Jens Kober
Associate Professor, CoR, TU Delft
Robotics, Machine Learning