Temporal-Difference Variational Continual Learning

📅 2024-10-10
🏛️ arXiv.org
📈 Citations: 0
Influential: 0
🤖 AI Summary
Catastrophic forgetting in continual learning (CL) remains challenging for Bayesian variational approaches, as recursive posterior updates accumulate approximation errors, hindering the balance between plasticity and stability. Method: We propose a novel variational objective incorporating multi-step historical posterior constraints—introducing temporal-difference ideas into Bayesian CL for the first time. Specifically, we explicitly regularize the divergence between the current posterior and posteriors from multiple previous tasks, thereby mitigating error accumulation and improving the trade-off between memory retention and task adaptability. Contribution/Results: Our method requires no replay buffers or architectural expansion. It achieves significant improvements over existing variational CL baselines across multiple standard benchmarks, effectively suppressing forgetting and enhancing generalization performance over long task sequences.
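The multi-step posterior constraint described above can be written schematically as follows. This is a hedged reconstruction from the summary, not the paper's exact objective; the horizon $K$ and weights $\lambda_k$ are assumptions:

$$
\mathcal{L}_t(q_t) \;=\; \mathbb{E}_{q_t(\theta)}\!\big[\log p(\mathcal{D}_t \mid \theta)\big] \;-\; \sum_{k=1}^{K} \lambda_k \,\mathrm{KL}\!\left(q_t(\theta) \,\middle\|\, q_{t-k}(\theta)\right), \qquad \lambda_k \ge 0,\ \sum_{k=1}^{K} \lambda_k = 1,
$$

where $q_{t-k}$ is the posterior approximation retained from task $t-k$. Standard variational continual learning corresponds to $K = 1$ (regularize only against the immediately preceding posterior); averaging over several previous posteriors is what parallels multi-step targets in Temporal-Difference learning.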

📝 Abstract
Machine Learning models in real-world applications must continuously learn new tasks to adapt to shifts in the data-generating distribution. Yet, for Continual Learning (CL), models often struggle to balance learning new tasks (plasticity) with retaining previous knowledge (memory stability). Consequently, they are susceptible to Catastrophic Forgetting, which degrades performance and undermines the reliability of deployed systems. In the Bayesian CL literature, variational methods tackle this challenge by employing a learning objective that recursively updates the posterior distribution while constraining it to stay close to its previous estimate. Nonetheless, we argue that these methods may be ineffective due to compounding approximation errors over successive recursions. To mitigate this, we propose new learning objectives that integrate the regularization effects of multiple previous posterior estimations, preventing individual errors from dominating future posterior updates and compounding over time. We reveal insightful connections between these objectives and Temporal-Difference methods, a popular learning mechanism in Reinforcement Learning and Neuroscience. Experiments on challenging CL benchmarks show that our approach effectively mitigates Catastrophic Forgetting, outperforming strong Variational CL methods.
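For mean-field Gaussian posteriors, the kind of multi-step regularizer the abstract describes can be computed in closed form. The sketch below is a minimal illustration under that assumption, not the paper's implementation; the weighting scheme and all names are hypothetical:

```python
import numpy as np

def diag_gaussian_kl(mu_q, var_q, mu_p, var_p):
    """Closed-form KL(q || p) between diagonal Gaussians, summed over dims."""
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

def multi_step_kl_penalty(current, history, weights):
    """Weighted sum of KLs from the current posterior to several previous
    task posteriors -- the multi-step constraint, vs. standard VCL's
    single KL to the most recent posterior."""
    mu_q, var_q = current
    return sum(
        w * diag_gaussian_kl(mu_q, var_q, mu_p, var_p)
        for w, (mu_p, var_p) in zip(weights, reversed(history))
    )

# Toy usage: three stored past posteriors over a 4-dim parameter vector.
rng = np.random.default_rng(0)
history = [(rng.normal(size=4), np.ones(4)) for _ in range(3)]
current = (rng.normal(size=4), np.full(4, 0.5))
weights = [0.5, 0.3, 0.2]  # hypothetical decay: recent tasks weighted highest
penalty = multi_step_kl_penalty(current, history, weights)
print(penalty >= 0.0)  # True: each KL term is non-negative
```

In training, `penalty` would be subtracted from the expected log-likelihood of the current task's data, so no single (possibly erroneous) previous posterior dominates the update.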
Problem

Research questions and friction points this paper is trying to address.

Balancing plasticity and memory stability in continual learning
Mitigating catastrophic forgetting in variational continual learning
Reducing compounding approximation errors in posterior updates
Innovation

Methods, ideas, or system contributions that make the work stand out.

Integrates multiple previous posterior estimations for regularization
Connects learning objectives to Temporal-Difference methods
Requires no replay buffers or architectural expansion