ViReSkill: Vision-Grounded Replanning with Skill Memory for LLM-Based Planning in Lifelong Robot Learning

📅 2025-09-28
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Robots face key challenges in lifelong learning: slow adaptation to new tasks, planning that lacks geometric and physical grounding, and unstable outputs from large language models (LLMs) and vision-language models (VLMs). To address these, we propose a vision-grounded replanning framework integrated with a reusable skill memory module. Our approach leverages VLMs to ground plans in scene geometry and object physical properties, while employing LLMs for knowledge-informed high-level planning. A state-feedback-driven dynamic replanning mechanism enables robust recovery from execution failures, and a skill memory module consolidates successful experiences for cross-task transfer. Evaluated on LIBERO, RLBench, and real robotic platforms, our method significantly improves task success rates and generalization over state-of-the-art baselines, establishing a reliable, adaptive, and sustainable closed-loop autonomous learning system for robots.
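As a rough illustration of the grounding-then-planning step summarized above, the sketch below shows how a VLM-derived scene description could condition an LLM plan. The `llm` client, its `complete` method, the `SceneGrounding` fields, and the prompt format are assumptions for illustration only, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class SceneGrounding:
    """Scene facts extracted by a VLM: object names, rough 3D poses, free-form physical notes."""
    objects: list[str]
    poses: dict[str, tuple[float, float, float]]
    notes: str

def plan_with_grounding(llm, task: str, g: SceneGrounding) -> list[str]:
    """Ask an LLM for a step-by-step plan conditioned on the grounded scene.

    `llm.complete(prompt) -> str` is an assumed text-completion interface,
    not a specific library API.
    """
    prompt = (
        f"Task: {task}\n"
        f"Visible objects: {', '.join(g.objects)}\n"
        f"Approximate poses: {g.poses}\n"
        f"Physical notes: {g.notes}\n"
        "Return one primitive robot action per line."
    )
    # Split the LLM reply into an action list, dropping empty lines.
    return [line.strip() for line in llm.complete(prompt).splitlines() if line.strip()]
```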

📝 Abstract
Robots trained via Reinforcement Learning (RL) or Imitation Learning (IL) often adapt slowly to new tasks, whereas recent Large Language Models (LLMs) and Vision-Language Models (VLMs) promise knowledge-rich planning from minimal data. Deploying LLMs/VLMs for motion planning, however, faces two key obstacles: (i) symbolic plans are rarely grounded in scene geometry and object physics, and (ii) model outputs can vary for identical prompts, undermining execution reliability. We propose ViReSkill, a framework that pairs vision-grounded replanning with a skill memory for accumulation and reuse. When a failure occurs, the replanner generates a new action sequence conditioned on the current scene, tailored to the observed state. On success, the executed plan is stored as a reusable skill and replayed in future encounters without additional calls to LLMs/VLMs. This feedback loop enables autonomous continual learning: each attempt immediately expands the skill set and stabilizes subsequent executions. We evaluate ViReSkill on simulators such as LIBERO and RLBench as well as on a physical robot. Across all settings, it consistently outperforms conventional baselines in task success rate, demonstrating robust sim-to-real generalization.
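The feedback loop described in the abstract — reuse a stored skill when one exists, replan from the observed scene on failure, and bank successful plans for later reuse — can be sketched roughly as follows. The callables `observe`, `execute`, and `replan`, and the dict-based skill memory, are placeholders for whatever perception, control, and VLM/LLM planning interfaces are available; this is a minimal sketch, not the authors' code.

```python
def run_task(task: str, skill_memory: dict, observe, execute, replan, max_attempts: int = 3) -> bool:
    """Closed-loop execution with skill reuse and vision-grounded replanning.

    skill_memory: maps task descriptions to previously successful plans.
    observe():   returns the current scene observation (e.g. an image).
    execute(p):  runs plan p on the robot and returns True on success.
    replan(task, scene): queries the VLM/LLM planner for a fresh plan.
    """
    # Reuse a stored skill if this task has succeeded before (no LLM/VLM call needed).
    plan = skill_memory.get(task)
    for _ in range(max_attempts):
        if plan is None:
            # No usable plan: replan, conditioned on the currently observed scene.
            plan = replan(task, observe())
        if execute(plan):
            skill_memory[task] = plan   # consolidate the successful plan as a reusable skill
            return True
        plan = None                     # failure: discard the plan and replan next attempt
    return False
```

Each successful attempt immediately expands the skill set, so repeated encounters with the same task become deterministic replays rather than fresh model queries.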
Problem

Research questions and friction points this paper is trying to address.

Slow robot adaptation to new tasks in lifelong learning scenarios
Unreliable symbolic planning due to ungrounded geometry and physics
Inconsistent model outputs undermining execution reliability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-grounded replanning for scene adaptation
Skill memory for plan accumulation and reuse
Autonomous continual learning via feedback loop
Tomoyuki Kagaya
Panasonic Connect Co., Ltd., Japan
Subramanian Lakshmi
Panasonic R&D Center, Singapore
Anbang Ye
HPC-AI Tech
Natural Language Processing, Machine Learning
Thong Jing Yuan
Panasonic R&D Center, Singapore
Jayashree Karlekar
Panasonic R&D Center, Singapore
Sugiri Pranata
Panasonic R&D Center, Singapore
Natsuki Murakami
Panasonic Connect Co., Ltd., Japan
Akira Kinose
Panasonic Connect Co., Ltd., Japan
Yang You
Postdoc, Stanford University
3D vision, computer graphics, computational geometry