AI Summary
This work addresses catastrophic forgetting in continual learning with NESS, a method that uses singular value decomposition to identify an approximate null space of each layer's previous-task inputs. Updates for new tasks are constrained to low-rank modifications confined to this subspace, preserving knowledge from previous tasks without explicit gradient projection. Because the subspace basis is fixed and only a single matrix is trained per task, NESS enforces the orthogonality constraint implicitly and efficiently, adapting to new tasks while safeguarding previously acquired knowledge. Evaluated on three standard benchmarks, NESS achieves competitive accuracy with consistently low forgetting and stable cross-task performance across sequential learning scenarios.
Abstract
Alleviating catastrophic forgetting while enabling further learning is a primary challenge in continual learning (CL). Orthogonal-based training methods have gained attention for their efficiency and strong theoretical properties, and many existing approaches enforce orthogonality through gradient projection. In this paper, we revisit orthogonality and exploit the fact that small singular values correspond to directions that are nearly orthogonal to the input space of previous tasks. Building on this principle, we introduce NESS (Null-space Estimated from Small Singular values), a CL method that applies orthogonality directly in the weight space rather than through gradient manipulation. Specifically, NESS constructs an approximate null space using the smallest singular values of each layer's input representation and parameterizes task-specific updates via a compact low-rank adaptation (LoRA-style) formulation constrained to this subspace. The subspace basis is fixed to preserve the null-space constraint, and only a single trainable matrix is learned for each task. This design ensures that the resulting updates remain approximately in the null space of previous inputs while enabling adaptation to new tasks. Our theoretical analysis and experiments on three benchmark datasets demonstrate competitive performance, low forgetting, and stable accuracy across tasks, highlighting the role of small singular values in continual learning. The code is available at https://github.com/pacman-ctm/NESS.
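The construction described in the abstract can be sketched in a few lines of NumPy. The shapes, the subspace rank `r`, and all variable names below are illustrative assumptions, not the authors' implementation: the right singular vectors paired with the smallest singular values of the previous-task input matrix serve as a fixed basis `V_null`, and only the matrix `B` of a LoRA-style update `delta_W = B @ V_null.T` would be trained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 100 previous-task inputs to a layer of width 32,
# with effective rank 24 so an approximate null space exists.
X_prev = rng.standard_normal((100, 24)) @ rng.standard_normal((24, 32))
X_prev += 1e-3 * rng.standard_normal((100, 32))  # small noise

# SVD of the input matrix: rows of Vt paired with the smallest singular
# values span an approximate null space of the previous inputs.
_, s, Vt = np.linalg.svd(X_prev, full_matrices=False)
r = 8                                # assumed subspace rank (hyperparameter)
V_null = Vt[-r:].T                   # (32, r) fixed basis, never trained

# LoRA-style update confined to the subspace; only B is learned per task.
B = 0.01 * rng.standard_normal((32, r))
delta_W = B @ V_null.T               # (32, 32) weight update

# Previous-task activations barely move under the constrained update,
# while an unconstrained update of the same scale perturbs them strongly.
drift = np.linalg.norm(X_prev @ delta_W.T)
unconstrained = 0.01 * rng.standard_normal((32, 32))
baseline = np.linalg.norm(X_prev @ unconstrained.T)
print(f"constrained drift:   {drift:.4f}")
print(f"unconstrained drift: {baseline:.4f}")
```

In the actual method, `B` would be optimized on the new task's loss while `V_null` stays frozen; the print-out merely illustrates that an update built from the small-singular-value directions leaves previous-task activations nearly unchanged.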