Active Continual Learning with Metaplastic Binary Bayesian Neural Networks

📅 2026-05-28

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This work addresses continual learning on edge devices, where tight compute budgets compound catastrophic forgetting, posterior saturation, and unreliable predictions. Without relying on a replay buffer, the authors propose BiMU, which is derived from a bounded-memory variational objective and maintains a non-degenerate binary Bayesian posterior to balance stability, plasticity, and forgetting in an online setting. Key components include controlled relaxation toward the prior, an uncertainty-dependent step size, and an active querying mechanism based on Monte Carlo disagreement. Empirically, BiMU sustains learning and strong out-of-distribution (OOD) detection across 1,000 tasks on Permuted-MNIST, and on OpenLORIS-Object it matches accuracy with up to 32× fewer label requests and backpropagation updates under class imbalance and feature compression.
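To make the saturation issue concrete, here is a minimal sketch of how a relaxation term and an uncertainty-scaled step can keep a single Bernoulli weight from freezing. It is not the paper's update rule. The parameterization (a logit `h` with mean `tanh(h)`), the step scaling by the variance `1 - tanh(h)^2`, and the names `update_logit`, `run_stream`, `lr`, `relax`, and `h_prior` are all assumptions made for illustration.

```python
import math

def update_logit(h, grad, h_prior=0.0, lr=0.5, relax=0.05):
    """One hypothetical update of a binary weight's logit h.

    The weight's mean is tanh(h), so 1 - tanh(h)^2 is its variance.
    Assumption: the data step is scaled by that variance, and a
    relaxation term pulls h back toward the prior logit h_prior.
    """
    m = math.tanh(h)
    step = lr * (1.0 - m * m)          # smaller step for confident weights
    return h - step * grad - relax * (h - h_prior)

def run_stream(grads, h=0.0, **kwargs):
    """Apply update_logit over a stream of gradients and return the final logit."""
    for g in grads:
        h = update_logit(h, g, **kwargs)
    return h

# A long stream that always pushes the weight toward +1.
stream = [-1.0] * 1000
h_relaxed = run_stream(stream)             # settles at a finite logit
h_frozen = run_stream(stream, relax=0.0)   # keeps drifting toward saturation
```

With relaxation the logit settles where the data step and the pull toward the prior cancel, so the weight keeps nonzero variance. With `relax=0.0` the mean creeps toward 1 and the variance-scaled step shrinks toward zero, which is the loss of plasticity the summary describes.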

📝 Abstract

Always-on edge systems must keep learning as conditions change under tight compute budgets and must detect unreliable predictions. Bayesian binary neural networks are attractive in this setting, but mean-field Bernoulli posteriors can saturate on long non-stationary streams, wiping out epistemic uncertainty and freezing plasticity. We propose BiMU, derived from a bounded-memory variational objective that balances stability, plasticity, and forgetting. BiMU combines a data term with controlled relaxation toward the prior and an uncertainty-dependent step size that prevents saturation and sustains informative uncertainty. This non-degenerate posterior enables fully online, buffer-free active querying via Monte Carlo disagreement, reducing label queries and backpropagation updates under imbalance. BiMU sustains learning and strong OOD detection on 1000-task Permuted-MNIST, and on OpenLORIS-Object achieves up to 32$\times$ label/update savings at matched accuracy under class imbalance and feature compression.
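The active-querying idea can be sketched as follows, under stated assumptions. The abstract does not define the disagreement score, so this example uses a simple stand-in: the fraction of sampled binary networks that vote against the majority class. The single-layer model and the names `sample_binary_weights`, `mc_disagreement`, `should_query`, and `threshold` are hypothetical.

```python
import random
from collections import Counter

def sample_binary_weights(probs, rng):
    """Draw weights in {-1, +1}; probs[c][i] is P(weight = +1)."""
    return [[1 if rng.random() < p else -1 for p in row] for row in probs]

def predict(weights, x):
    """Class with the largest pre-activation for one sampled network."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return max(range(len(scores)), key=scores.__getitem__)

def mc_disagreement(probs, x, n_samples=32, seed=0):
    """Fraction of Monte Carlo samples voting against the majority class."""
    rng = random.Random(seed)
    votes = Counter(
        predict(sample_binary_weights(probs, rng), x) for _ in range(n_samples)
    )
    return 1.0 - votes.most_common(1)[0][1] / n_samples

def should_query(probs, x, threshold=0.2, **kwargs):
    """Request a label (and an update) only when the samples disagree enough."""
    return mc_disagreement(probs, x, **kwargs) > threshold

x = [1.0, 1.0, 1.0]
saturated = [[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]   # every weight is deterministic
uncertain = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]   # every weight is a coin flip
d_saturated = mc_disagreement(saturated, x)
d_uncertain = mc_disagreement(uncertain, x)
```

A saturated posterior always yields the same sampled network, so its disagreement is exactly zero and it never requests a label. This is why a query rule of this kind depends on the posterior staying non-degenerate.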

Problem

Research questions and friction points this paper is trying to address.

continual learning

Bayesian neural networks

epistemic uncertainty

edge systems

non-stationary streams

Innovation

Methods, ideas, or system contributions that make the work stand out.

Continual Learning

Bayesian Neural Networks

Binary Weights