🤖 AI Summary
Continual learning poses a fundamental challenge for artificial neural networks: balancing stability (retaining prior knowledge) against plasticity (acquiring new knowledge), with catastrophic forgetting at one extreme and memory rigidity at the other. This paper introduces MESU, a Bayesian framework that unifies synaptic plasticity modeling, parameter uncertainty estimation, and Hessian-approximated regularization to achieve an adaptive learning–forgetting trade-off without explicit task boundaries. Using weight-sampling-based inference and metaplasticity mechanisms, MESU applies uncertainty-driven parameter updates that preserve critical knowledge and discard obsolete information in streaming data, while also enabling out-of-distribution sample detection. Evaluated on 200 sequential permuted MNIST tasks, MESU mitigates forgetting and outperforms established continual-learning techniques in accuracy, in capacity to learn additional tasks, and in out-of-distribution detection. On incremental CIFAR-100 learning, it maintains consistent performance gains and achieves high-confidence anomaly identification.
📝 Abstract
Biological synapses effortlessly balance memory retention and flexibility, yet artificial neural networks still struggle with the extremes of catastrophic forgetting and catastrophic remembering. Here, we introduce Metaplasticity from Synaptic Uncertainty (MESU), a Bayesian framework that updates network parameters according to their uncertainty. This approach allows a principled combination of learning and forgetting, ensuring that critical knowledge is preserved while unused or outdated information is gradually released. Unlike standard Bayesian approaches, which risk becoming overly constrained, and popular continual-learning methods, which rely on explicit task boundaries, MESU adapts seamlessly to streaming data. It further provides reliable epistemic uncertainty estimates, allowing out-of-distribution detection; the only additional computational cost is sampling the weights multiple times to obtain proper output statistics. Experiments on image-classification benchmarks demonstrate that MESU mitigates catastrophic forgetting while maintaining plasticity for new tasks. When trained on 200 sequential permuted MNIST tasks, MESU outperforms established continual-learning techniques in terms of accuracy, capability to learn additional tasks, and out-of-distribution data detection. Additionally, because it does not rely on task boundaries, MESU consistently outperforms conventional learning techniques on the incremental training of CIFAR-100 tasks across a wide range of scenarios. Our results unify ideas from metaplasticity, Bayesian inference, and Hessian-based regularization, offering a biologically inspired pathway to robust, perpetual learning.
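The weight-sampling step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single linear layer whose weights follow independent Gaussians with means `mu` and standard deviations `sigma`, and it uses the common entropy decomposition (total predictive entropy minus expected per-sample entropy) as the epistemic uncertainty score. The abstract does not say which score the paper uses, and the function name `predict_with_uncertainty` and its parameters are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; zero-probability terms contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def predict_with_uncertainty(x, mu, sigma, n_samples=50, rng=None):
    """Monte Carlo prediction for a linear layer with Gaussian weights.

    Assumption (not from the paper): weights are independent,
    w[k][j] ~ N(mu[k][j], sigma[k][j]^2), and the model is logits = W x.
    Returns (mean class probabilities, total entropy, epistemic score).
    """
    rng = rng or random.Random(0)
    n_classes = len(mu)
    mean_probs = [0.0] * n_classes
    mean_entropy = 0.0
    for _ in range(n_samples):
        # Draw one weight matrix and run a forward pass with it.
        logits = [
            sum(rng.gauss(mu[k][j], sigma[k][j]) * x[j] for j in range(len(x)))
            for k in range(n_classes)
        ]
        probs = softmax(logits)
        mean_probs = [a + p / n_samples for a, p in zip(mean_probs, probs)]
        mean_entropy += entropy(probs) / n_samples
    total = entropy(mean_probs)
    # Disagreement between sampled models: high when weights are uncertain.
    epistemic = total - mean_entropy
    return mean_probs, total, epistemic
```

Under these assumptions, an input can be flagged as out-of-distribution when the epistemic score exceeds a threshold chosen on held-out data. The cost scales linearly with `n_samples`, which corresponds to the overhead of repeated weight sampling that the abstract describes.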