Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning

πŸ“… 2026-03-04
πŸ“ˆ Citations: 0
✨ Influential: 0
πŸ“„ PDF
πŸ€– AI Summary
This work addresses catastrophic forgetting in robotic continual learning, where acquiring new skills often leads to the loss of previously learned ones. The authors propose a strategy that leverages large-scale pretrained vision-language-action (VLA) models in conjunction with lightweight experience replay and behavioral cloning for policy learning. Their findings demonstrate that pretrained VLA models exhibit remarkable robustness against forgetting, effectively retaining prior knowledge and enabling rapid recovery of forgotten skillsβ€”even with an extremely small replay buffer. The approach achieves near-zero forgetting while preserving strong forward transfer capabilities, suggesting that large-scale pretraining fundamentally reshapes the dynamics of continual learning in embodied agents.

πŸ“ Abstract
Continual learning is a long-standing challenge in robot policy learning, where a policy must acquire new skills over time without catastrophically forgetting previously learned ones. While prior work has extensively studied continual learning in relatively small behavior cloning (BC) policy models trained from scratch, its behavior in modern large-scale pretrained Vision-Language-Action (VLA) models remains underexplored. In this work, we found that pretrained VLAs are remarkably resistant to forgetting compared with smaller policy models trained from scratch. Simple Experience Replay (ER) works surprisingly well on VLAs, sometimes achieving zero forgetting even with a small replay data size. Our analysis reveals that pretraining plays a critical role in downstream continual learning performance: large pretrained models mitigate forgetting with a small replay buffer size while maintaining strong forward learning capabilities. Furthermore, we found that VLAs can retain relevant knowledge from prior tasks despite performance degradation during learning new tasks. This knowledge retention enables rapid recovery of seemingly forgotten skills through finetuning. Together, these insights imply that large-scale pretraining fundamentally changes the dynamics of continual learning, enabling models to continually acquire new skills over time with simple replay. Code and more information can be found at https://ut-austin-rpl.github.io/continual-vla
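The core mechanism the abstract credits, simple Experience Replay mixed into behavioral cloning, can be illustrated with a minimal sketch. This is a hypothetical toy implementation, not the paper's code: the buffer here holds generic (observation, action) samples and uses reservoir sampling to keep a small uniform subsample of past-task data, whereas the paper's replay data would be full robot trajectories with images and language instructions.

```python
import random


class ReplayBuffer:
    """Small fixed-capacity buffer of past-task (observation, action) samples.

    Reservoir sampling keeps a uniform random subsample of everything
    seen so far, which is one common way to fill a tiny replay buffer.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []
        self.n_seen = 0  # total samples ever offered to the buffer

    def add(self, sample):
        self.n_seen += 1
        if len(self.data) < self.capacity:
            self.data.append(sample)
        else:
            # Replace an existing slot with probability capacity / n_seen,
            # preserving a uniform sample over the whole stream.
            idx = random.randrange(self.n_seen)
            if idx < self.capacity:
                self.data[idx] = sample

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))


def mixed_batch(new_task_batch, buffer, replay_k):
    """Behavioral-cloning batch for the current task, padded with a few
    replayed samples from earlier tasks (the Experience Replay step)."""
    return new_task_batch + buffer.sample(replay_k)
```

The finding in the abstract is that, for large pretrained VLAs, even a very small `capacity` here is enough to achieve near-zero forgetting, which would not hold for small from-scratch BC policies.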
Problem

Research questions and friction points this paper is trying to address.

continual learning
catastrophic forgetting
robot policy learning
Vision-Language-Action models
skill retention
Innovation

Methods, ideas, or system contributions that make the work stand out.

Vision-Language-Action Models
Continual Learning
Catastrophic Forgetting
Experience Replay
Pretraining
πŸ”Ž Similar Papers
No similar papers found.