Evolvable Embodied Agent for Robotic Manipulation via Long Short-Term Reflection and Optimization

📅 2026-04-15

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Traditional robotic approaches are limited in generalization, training efficiency, and interpretability, which hinders continuous self-adaptation from environmental feedback. This work proposes the Evolvable Embodied Agent (EEAgent) framework, which uses large vision-language models (VLMs) for environmental perception and task planning, and introduces a Long Short-Term Reflective Optimization (LSTRO) mechanism. LSTRO combines past experiences with newly learned lessons to iteratively refine the agent's prompts, enabling continual self-evolution. Evaluated on six tasks from VIMA-Bench, the method achieves new state-of-the-art results and notably outperforms existing baselines in complex scenarios.

📝 Abstract

Achieving general-purpose robotics requires empowering robots to adapt and evolve based on their environment and feedback. Traditional methods face limitations such as extensive training requirements, difficulties in cross-task generalization, and lack of interpretability. Prompt learning offers new opportunities for self-evolving robots without extensive training, but simply reflecting on past experiences is not enough: extracting meaningful insights from task successes and failures remains a challenge. To this end, we propose the evolvable embodied agent (EEAgent) framework, which leverages large vision-language models (VLMs) for better environmental interpretation and policy planning. To enhance reflection on past experiences, we propose a long short-term reflective optimization (LSTRO) mechanism that dynamically refines prompts based on both past experiences and newly learned lessons, facilitating continuous self-evolution and thereby improving overall task success rates. Evaluations on six VIMA-Bench tasks show that our approach sets a new state of the art, notably outperforming baselines in complex scenarios.
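
The abstract does not spell out how LSTRO is implemented, so the following is only a minimal sketch of the general idea it describes: keeping recent lessons and older lessons in separate stores and folding both into the prompt. The class `ReflectivePrompt`, its methods, the window size, and the promotion rule (a lesson moves to long-term memory when it is about to leave the recent window) are all hypothetical choices made for illustration, not the authors' method. In a real system the lesson text would presumably come from a VLM reflecting on the episode; here it is passed in as a plain string.

```python
from collections import deque


class ReflectivePrompt:
    """Hypothetical prompt builder with short-term and long-term lesson stores."""

    def __init__(self, base_prompt, short_term_size=3):
        if short_term_size < 1:
            raise ValueError("short_term_size must be at least 1")
        self.base_prompt = base_prompt
        # Short-term memory: a bounded window of the most recent lessons.
        self.short_term = deque(maxlen=short_term_size)
        # Long-term memory: lessons kept after they leave the window.
        self.long_term = []

    def reflect(self, task, success, lesson):
        """Record a lesson from one episode (assumed to be produced elsewhere)."""
        # If the window is full, the oldest lesson is about to be evicted,
        # so promote it to long-term memory first (skipping duplicates).
        if len(self.short_term) == self.short_term.maxlen:
            oldest = self.short_term[0]
            if oldest not in self.long_term:
                self.long_term.append(oldest)
        outcome = "success" if success else "failure"
        self.short_term.append(f"[{outcome}] {task}: {lesson}")

    def render(self):
        """Build the prompt text from the base prompt plus both lesson stores."""
        lines = [self.base_prompt]
        if self.long_term:
            lines.append("Long-term lessons:")
            lines.extend(f"- {item}" for item in self.long_term)
        if self.short_term:
            lines.append("Recent lessons:")
            lines.extend(f"- {item}" for item in self.short_term)
        return "\n".join(lines)


if __name__ == "__main__":
    prompt = ReflectivePrompt("Plan the manipulation task.", short_term_size=2)
    prompt.reflect("task_a", False, "check object color before grasping")
    prompt.reflect("task_b", True, "placing the base object first worked")
    prompt.reflect("task_a", True, "color check avoided the earlier mistake")
    print(prompt.render())
```

The split into two stores is the only part taken from the text; how lessons are generated, merged, or pruned in the actual framework is not stated in this summary.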

Problem

Research questions and friction points this paper is trying to address.

robotic manipulation

self-evolution

experience reflection

cross-task generalization

prompt learning

Innovation

Methods, ideas, or system contributions that make the work stand out.

Evolvable Embodied Agent

Long Short-Term Reflective Optimization

Vision-Language Models