Dejavu: Post-Deployment Learning for Embodied Agents via Experience Feedback

📅 2025-10-11
🤖 AI Summary
Embodied agents struggle to continue learning after deployment, which prevents them from improving through accumulated experience. This paper introduces Dejavu, the first framework to enable continual adaptation of frozen-weight vision-language-action (VLA) models in real-world environments. Dejavu employs an Experience Feedback Network (EFN) that retrieves historically successful trajectories and computes rewards from trajectory-level semantic similarity, dynamically augmenting the policy's action outputs. Crucially, it avoids model fine-tuning entirely by pairing incremental trajectory storage with efficient memory retrieval. Experiments across diverse embodied tasks demonstrate that Dejavu significantly improves task success rates, robustness to environmental perturbations, and zero-shot generalization, validating the feasibility and effectiveness of parameter-free, post-deployment continual learning for embodied AI.

📝 Abstract
Embodied agents face a fundamental limitation: once deployed in real-world environments to perform specific tasks, they are unable to acquire new useful knowledge to enhance task performance. In this paper, we propose a general post-deployment learning framework called Dejavu, which employs an Experience Feedback Network (EFN) and augments the frozen Vision-Language-Action (VLA) policy with retrieved execution memories. EFN automatically identifies contextually successful prior action experiences and conditions action prediction on this retrieved guidance. We adopt reinforcement learning with semantic similarity rewards on EFN to ensure that the predicted actions align with past successful behaviors under current observations. During deployment, EFN continually enriches its memory with new trajectories, enabling the agent to exhibit "learning from experience" despite fixed weights. Experiments across diverse embodied tasks show that EFN significantly improves adaptability, robustness, and success rates over frozen baselines. These results highlight a promising path toward embodied agents that continually refine their behavior after deployment.
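The retrieval-and-augmentation loop the abstract describes can be sketched in a few lines. The memory layout, cosine-similarity retrieval, and linear blending rule below are illustrative assumptions, not the paper's exact EFN implementation, which conditions a learned action predictor on the retrieved guidance.

```python
import numpy as np

class ExperienceMemory:
    """Toy trajectory store mapping observation embeddings of past
    successful executions to the actions that succeeded."""

    def __init__(self):
        self.keys = []     # observation embeddings
        self.values = []   # corresponding successful actions

    def add(self, obs_embedding, action):
        self.keys.append(np.asarray(obs_embedding, dtype=float))
        self.values.append(np.asarray(action, dtype=float))

    def retrieve(self, query, k=1):
        """Return the k stored actions whose keys are most
        cosine-similar to the query embedding."""
        query = np.asarray(query, dtype=float)
        sims = [
            float(key @ query / (np.linalg.norm(key) * np.linalg.norm(query) + 1e-8))
            for key in self.keys
        ]
        top = np.argsort(sims)[::-1][:k]
        return [self.values[i] for i in top], [sims[i] for i in top]

def augmented_action(frozen_policy_action, memory, obs_embedding, alpha=0.5):
    """Blend the frozen policy's action with the best-matching past action,
    weighting the retrieved guidance by its similarity to the current
    observation (a hypothetical fusion rule)."""
    if not memory.keys:
        return np.asarray(frozen_policy_action, dtype=float)
    (best_action,), (sim,) = memory.retrieve(obs_embedding, k=1)
    w = alpha * max(sim, 0.0)
    return (1 - w) * np.asarray(frozen_policy_action, dtype=float) + w * best_action
```

With an empty memory the frozen policy's action passes through unchanged, so the augmentation degrades gracefully before any experience has accumulated.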
Problem

Research questions and friction points this paper is trying to address.

Enables embodied agents to learn from experience after deployment
Augments frozen policies with retrieved successful action memories
Improves adaptability and success rates across diverse embodied tasks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Experience Feedback Network augments frozen VLA policy
Reinforcement learning with semantic similarity rewards
Enriches memory with new trajectories during deployment
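The last two ideas above can be sketched together: a trajectory-level semantic similarity score used as an RL reward, and a deployment-time rule that admits only successful trajectories into memory. Flattened action vectors stand in for the learned trajectory embeddings the paper would use; both functions are illustrative assumptions.

```python
import numpy as np

def semantic_similarity_reward(predicted_traj, reference_traj):
    """Cosine similarity between flattened trajectories, used as a reward
    signal so predicted actions align with past successful behavior.
    A stand-in for the paper's trajectory-level semantic reward."""
    p = np.asarray(predicted_traj, dtype=float).ravel()
    r = np.asarray(reference_traj, dtype=float).ravel()
    return float(p @ r / (np.linalg.norm(p) * np.linalg.norm(r) + 1e-8))

def enrich_memory(memory, trajectory, succeeded):
    """Deployment-time rule: only successful trajectories are stored,
    so the memory accumulates usable experience without weight updates."""
    if succeeded:
        memory.append(trajectory)
    return memory
```

Because only retrieval keys and stored trajectories change over time, the agent exhibits "learning from experience" while the VLA policy's weights stay frozen.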
👥 Authors
Shaokai Wu
School of Computer Science, Shanghai Jiao Tong University
Yanbiao Ji
Shanghai Jiao Tong University
Qiuchang Li
School of Computer Science, Shanghai Jiao Tong University
Zhiyi Zhang
School of Computer Science, Shanghai Jiao Tong University
Qichen He
School of Computer Science, Shanghai Jiao Tong University
Wenyuan Xie
School of Computer Science, Shanghai Jiao Tong University
Guodong Zhang
xAI
Bayram Bayramli
School of Computer Science, Shanghai Jiao Tong University
Yue Ding
School of Computer Science, Shanghai Jiao Tong University
Hongtao Lu
Shanghai Jiao Tong University