MLLM-CL: Continual Learning for Multimodal Large Language Models

📅 2025-06-05
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
Multimodal large language models (MLLMs) face catastrophic forgetting and paradigm fragmentation when continually acquiring new knowledge and capabilities in dynamic real-world scenarios. Method: We introduce the first MLLM continual learning benchmark jointly supporting domain evolution and capability emergence. Our approach unifies IID domain continual learning and non-IID capability continual learning—two previously disjoint paradigms—via a parameter isolation mechanism and an MLLM-driven dynamic routing strategy. It further incorporates multi-stage incremental training and cross-modal knowledge consolidation to mitigate forgetting. Results: Experiments demonstrate an average accuracy improvement of 19.4% on both domain and capability continual learning tasks, alongside a 32.7% gain in knowledge integration efficiency. This work establishes a novel, reproducible paradigm and benchmark for MLLM continual learning.

Technology Category

Application Category

📝 Abstract
Recent Multimodal Large Language Models (MLLMs) excel in vision-language understanding but face challenges in adapting to dynamic real-world scenarios that require continuous integration of new knowledge and skills. While continual learning (CL) offers a potential solution, existing benchmarks and methods suffer from critical limitations. In this paper, we introduce MLLM-CL, a novel benchmark encompassing domain and ability continual learning, where the former focuses on independently and identically distributed (IID) evaluation across evolving mainstream domains, whereas the latter evaluates on non-IID scenarios with emerging model ability. Methodologically, we propose preventing catastrophic interference through parameter isolation, along with an MLLM-based routing mechanism. Extensive experiments demonstrate that our approach can integrate domain-specific knowledge and functional abilities with minimal forgetting, significantly outperforming existing methods.
Problem

Research questions and friction points this paper is trying to address.

Adapting MLLMs to dynamic real-world scenarios
Addressing limitations in continual learning benchmarks
Preventing catastrophic interference in MLLM updates
Innovation

Methods, ideas, or system contributions that make the work stand out.

Parameter isolation prevents catastrophic interference
MLLM-based routing mechanism enhances adaptability
Benchmark includes domain and ability continual learning
🔎 Similar Papers
H
Hongbo Zhao
Institute of Automation, Chinese Academy of Sciences (CASIA), University of Chinese Academy of Sciences (UCAS)
F
Fei Zhu
Centre for Artificial Intelligence and Robotics, HKISI, CAS, University of Chinese Academy of Sciences (UCAS)
Rundong Wang
Rundong Wang
Ph.D. student of Computer Science, Nanyang Technological University
artificial intelligencereinforcement learning
G
Gaofeng Meng
Institute of Automation, Chinese Academy of Sciences (CASIA), Centre for Artificial Intelligence and Robotics, HKISI, CAS, University of Chinese Academy of Sciences (UCAS)
Zhaoxiang Zhang
Zhaoxiang Zhang
Institute of Automation, Chinese Academy of Sciences
Computer VisionPattern RecognitionBiologically-inspired Learning