🤖 AI Summary
This work addresses the challenge of balancing stability and plasticity in continual learning on sequential data. Inspired by the modular organization of the human brain, the authors propose MoRe, a framework that constructs a theoretically identifiable hierarchical modular structure in representation space. MoRe decomposes knowledge into shared fundamental modules and task-specific modules, enabling module reuse, alignment, and expansion. By using time-delayed dependencies to uncover intrinsic sequence structure and combining modular learning with identifiability constraints, MoRe organizes and protects knowledge in a structured way without requiring explicit task boundaries. Experiments on synthetic benchmarks and activation data from large language models show that MoRe learns interpretable hierarchical representations and improves the stability-plasticity trade-off in continual learning.
📝 Abstract
Continual learning requires models to adapt to new data while preserving previously acquired knowledge. At its core, this challenge can be viewed as principled one-step adaptation: incorporating new information with minimal interference to existing representations. Most existing approaches address it by modifying model parameters or architectures in a supervised, task-specific manner. However, the underlying issue is representational: tasks require distinct yet structured representations that can be selectively updated without disrupting existing representations, and this structure should reflect intrinsic organization in the data rather than task boundaries. In sequential data, time-delayed dependencies provide a natural signal for uncovering this organization, revealing how fundamental representations give rise to more specific ones. Inspired by the modular organization of the human brain, we propose MoRe, a framework that identifies modularity in the representation itself rather than allocating it at the architectural level. MoRe decomposes knowledge into a hierarchy of fundamental and specific modules with identifiability guarantees, enabling principled module reuse, alignment, and expansion during adaptation while preserving old modules by construction. Experiments on synthetic benchmarks and real-world LLM activations demonstrate interpretable hierarchical structure and improved plasticity-stability trade-offs, suggesting MoRe as a principled foundation for continual adaptation.