🤖 AI Summary
Generative models suffer from catastrophic forgetting in continual learning. This paper systematically surveys recent advances in continual learning for large language models, multimodal large language models, vision-language-action models, and diffusion models. It proposes a brain-inspired, memory-mechanism-based taxonomy that unifies existing approaches into architecture-based, regularization-based, and replay-based methods. It further develops a comprehensive analytical framework for generative models, covering training objectives, benchmarks, and backbone architectures, and integrating techniques such as knowledge distillation, elastic weight consolidation, experience replay, parameter isolation, and gradient projection. A task- and domain-incremental evaluation protocol is designed to support rigorous assessment. The survey covers over 100 state-of-the-art works and introduces *Awesome-CL-in-GMs*, the first open-source repository dedicated to continual learning in generative models, providing both theoretical foundations and practical guidelines for algorithmic innovation and real-world deployment.
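As one concrete illustration of the regularization-based paradigm mentioned above, here is a minimal sketch of the elastic weight consolidation (EWC) penalty, which discourages parameters important to old tasks from drifting. The function name and the plain-Python formulation are ours for illustration, not code from the survey or its repository:

```python
def ewc_penalty(theta, theta_star, fisher, lam=1.0):
    """EWC regularizer: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    theta      -- current parameter values (list of floats)
    theta_star -- parameters after finishing the previous task
    fisher     -- diagonal Fisher information estimates (per-parameter importance)
    lam        -- strength of the penalty (hypothetical hyperparameter name)
    """
    return 0.5 * lam * sum(
        f * (t - ts) ** 2 for t, ts, f in zip(theta, theta_star, fisher)
    )


def total_loss(task_loss, theta, theta_star, fisher, lam=1.0):
    """Combined objective: new-task loss plus the EWC anchor to old tasks."""
    return task_loss + ewc_penalty(theta, theta_star, fisher, lam)
```

In practice the same quadratic penalty is added to the training loss of a neural network (e.g. in PyTorch) with the Fisher diagonal estimated from gradients on the previous task's data; the sketch above only shows the arithmetic.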
📝 Abstract
The rapid advancement of generative models has enabled modern AI systems to comprehend and produce highly sophisticated content, even achieving human-level performance in specific domains. However, these models remain fundamentally constrained by catastrophic forgetting: a persistent challenge in which adapting to new tasks typically degrades performance on previously learned ones. To address this practical limitation, numerous approaches have been proposed to enhance the adaptability and scalability of generative models in real-world applications. In this work, we present a comprehensive survey of continual learning methods for mainstream generative models, including large language models, multimodal large language models, vision-language-action models, and diffusion models. Drawing inspiration from the memory mechanisms of the human brain, we systematically categorize these approaches into three paradigms: architecture-based, regularization-based, and replay-based methods, and elucidate their underlying methodologies and motivations. We further analyze continual learning setups for different generative models, including training objectives, benchmarks, and core backbones, offering deeper insights into the field. The project page of this paper is available at https://github.com/Ghy0501/Awesome-Continual-Learning-in-Generative-Models.
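To make the replay-based paradigm concrete, below is a minimal sketch of a fixed-size replay buffer using reservoir sampling, a common way to keep an unbiased subset of past examples for rehearsal. The class name and interface are ours for illustration; the survey itself does not prescribe this particular implementation:

```python
import random


class ReplayBuffer:
    """Fixed-capacity buffer using reservoir sampling, so every example
    seen so far has equal probability of being retained for replay."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.buffer = []
        self.n_seen = 0                     # total examples observed so far
        self.rng = random.Random(seed)

    def add(self, example):
        """Observe one example from the current task's stream."""
        self.n_seen += 1
        if len(self.buffer) < self.capacity:
            self.buffer.append(example)
        else:
            # Replace a stored example with probability capacity / n_seen.
            j = self.rng.randrange(self.n_seen)
            if j < self.capacity:
                self.buffer[j] = example

    def sample(self, k):
        """Draw a rehearsal mini-batch of up to k stored examples."""
        return self.rng.sample(self.buffer, min(k, len(self.buffer)))
```

In a continual-learning loop, batches from the new task would be mixed with `sample(...)` draws from this buffer so the model keeps rehearsing earlier tasks while it trains.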