C-Flat++: Towards a More Efficient and Powerful Framework for Continual Learning

📅 2025-08-26
🤖 AI Summary
In continual learning, models face a fundamental trade-off between adaptability to new tasks and stability of previously acquired knowledge; moreover, conventional zeroth-order sharpness-based optimization can converge to sharp minima, degrading generalization robustness. To address this, we propose C-Flat, the first loss-landscape flatness optimization framework explicitly designed for continual learning, together with its lightweight, efficient variant C-Flat++. Both integrate in a plug-and-play fashion with mainstream paradigms, including replay, regularization, and parameter isolation. Our method combines sharpness-aware principles with a selective flatness-driven update mechanism, requiring no architectural modifications and significantly reducing computational overhead. Extensive experiments demonstrate that C-Flat consistently improves both accuracy and stability across diverse benchmarks, algorithms, and continual learning scenarios, while C-Flat++ achieves comparable performance at substantially reduced training cost, offering both theoretical rigor and practical deployability.

📝 Abstract
Balancing sensitivity to new tasks and stability for retaining past knowledge is crucial in continual learning (CL). Recently, sharpness-aware minimization has proven effective in transfer learning and has also been adopted in CL to improve memory retention and learning efficiency. However, relying on zeroth-order sharpness alone may favor sharper minima over flatter ones in certain settings, leading to less robust and potentially suboptimal solutions. In this paper, we propose Continual Flatness (C-Flat), a method that promotes flatter loss landscapes tailored for CL. C-Flat offers plug-and-play compatibility, enabling easy integration with minimal modifications to the code pipeline. Moreover, we present a general framework that integrates C-Flat into all major CL paradigms and conduct comprehensive comparisons with loss-minima optimizers and flat-minima-based CL methods. Our results show that C-Flat consistently improves performance across a wide range of settings. In addition, we introduce C-Flat++, an efficient yet effective framework that leverages selective flatness-driven promotion, significantly reducing the update cost required by C-Flat. Extensive experiments across multiple CL methods, datasets, and scenarios demonstrate the effectiveness and efficiency of our proposed approaches. Code is available at https://github.com/WanNaa/C-Flat.
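For context, here is a minimal PyTorch sketch of the SAM-style two-step update that zeroth-order sharpness methods build on, written as a plug-and-play wrapper around a base optimizer. The class and names (`SAMLikeOptimizer`, `rho`, `ascent_step`, `descent_step`) are illustrative assumptions, not the actual C-Flat API from the repository.

```python
# Illustrative sketch of a SAM-style sharpness-aware update; names are
# hypothetical and do not reflect the real C-Flat implementation.
import torch


class SAMLikeOptimizer:
    """Wraps a base optimizer with a sharpness-aware perturbation step."""

    def __init__(self, params, base_optimizer, rho=0.05):
        self.params = list(params)
        self.base = base_optimizer   # e.g. torch.optim.SGD over the same params
        self.rho = rho               # radius of the perturbation neighborhood
        self._eps = []

    @torch.no_grad()
    def ascent_step(self):
        # Move weights toward the locally sharpest direction: w <- w + rho * g / ||g||.
        grads = [p.grad for p in self.params if p.grad is not None]
        norm = torch.norm(torch.stack([g.norm() for g in grads])) + 1e-12
        self._eps = []
        for p in self.params:
            if p.grad is None:
                self._eps.append(None)
                continue
            e = self.rho * p.grad / norm
            p.add_(e)
            self._eps.append(e)
        self.base.zero_grad()  # fresh gradients will be computed at the perturbed point

    @torch.no_grad()
    def descent_step(self):
        # Restore the original weights, then step with the perturbed-point gradient.
        for p, e in zip(self.params, self._eps):
            if e is not None:
                p.sub_(e)
        self.base.step()
        self.base.zero_grad()
```

Each such update requires two forward/backward passes: one at the original weights to compute the perturbation, and one at the perturbed weights to compute the update direction. This doubled per-step cost is the overhead that C-Flat++'s selective mechanism aims to reduce.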
Problem

Research questions and friction points this paper is trying to address.

Balancing task sensitivity and knowledge stability in continual learning
Addressing suboptimal solutions from zeroth-order sharpness minimization
Reducing update costs while maintaining flat loss landscapes
Innovation

Methods, ideas, or system contributions that make the work stand out.

Promotes flatter loss landscapes for continual learning
Offers plug-and-play compatibility with minimal modifications
Leverages selective flatness-driven promotion to reduce update cost (see the sketch below)
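The abstract does not spell out the selection criterion, so the sketch below uses a simple every-`k` schedule as a hypothetical stand-in for selective flatness-driven promotion: the costly sharpness-aware step runs only on selected iterations, and the base optimizer updates as usual otherwise. It assumes the `SAMLikeOptimizer` wrapper sketched earlier.

```python
# Hypothetical every-k schedule standing in for whatever selection
# criterion C-Flat++ actually uses.
def train_step(model, loss_fn, batch, optimizer, step, k=4):
    """One training iteration with an illustrative selective flatness schedule."""
    x, y = batch
    loss_fn(model(x), y).backward()
    if step % k == 0:
        # Flatness-promoting update: perturb, re-evaluate the gradient, restore, step.
        optimizer.ascent_step()
        loss_fn(model(x), y).backward()
        optimizer.descent_step()
    else:
        # Plain base-optimizer update on the remaining iterations.
        optimizer.base.step()
        optimizer.base.zero_grad()
```

Under such a schedule, the extra forward/backward pass is paid on only a fraction of iterations, which is consistent with the reduced training cost the paper reports, though the actual gating rule may differ.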
Wei Li
College of Computer Science, Sichuan University, China
Hangjie Yuan
Alibaba DAMO | ZJU | MMLab@NTU
Generative Models · Multimodal Models · Foundation Models · Video Understanding
Zixiang Zhao
ETH Zürich
Computer Vision · Machine Learning · Computational Imaging
Yifan Zhu
Beijing University of Posts and Telecommunications
PEFT of LLMs · Graph RAG · Graph Mining
Aojun Lu
College of Computer Science, Sichuan University, China
Tao Feng
Department of Computer Science and Technology, Tsinghua University, China
Yanan Sun
College of Computer Science, Sichuan University, China