Elastic Weight Consolidation Done Right for Continual Learning

📅 2026-03-19
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses a fundamental flaw in Elastic Weight Consolidation (EWC) and its variants for continual learning, where biased importance estimates—stemming from vanishing gradients and redundant protection in the Fisher Information Matrix (FIM)—exacerbate catastrophic forgetting. The study is the first to reveal this core deficiency in weight importance evaluation and introduces a Logits Reversal operation that effectively rectifies the FIM computation without requiring complex architectural modifications. This simple yet effective adjustment yields more accurate parameter importance estimates. Extensive experiments across diverse continual learning benchmarks demonstrate that the proposed method consistently outperforms original EWC and its extensions, achieving superior model performance while significantly mitigating catastrophic forgetting.

📝 Abstract
Weight regularization methods in continual learning (CL) alleviate catastrophic forgetting by assessing and penalizing changes to important model weights. Elastic Weight Consolidation (EWC) is a foundational and widely used approach within this framework that estimates weight importance based on gradients. However, it has consistently shown suboptimal performance. In this paper, we conduct a systematic analysis of importance estimation in EWC from a gradient-based perspective. For the first time, we find that EWC's reliance on the Fisher Information Matrix (FIM) leads to gradient vanishing and inaccurate importance estimation in certain scenarios. Our analysis also reveals that Memory Aware Synapses (MAS), a variant of EWC, imposes unnecessary constraints on parameters irrelevant to prior tasks, which we term redundant protection. Consequently, both EWC and its variants exhibit fundamental misalignments in estimating weight importance, leading to inferior performance. To tackle these issues, we propose the Logits Reversal (LR) operation, a simple yet effective modification that rectifies EWC's importance estimation. Specifically, reversing the logit values during the calculation of the FIM effectively prevents both gradient vanishing and redundant protection. Extensive experiments across various CL tasks and datasets show that the proposed method significantly outperforms EWC and its existing variants. We therefore refer to it as EWC Done Right (EWC-DR).
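The abstract describes estimating per-weight importance as the diagonal of the Fisher Information Matrix from gradients of the log-likelihood, with a Logits Reversal step applied before the FIM is computed. The sketch below illustrates this pipeline for a tiny linear softmax classifier. It is a minimal illustration, not the paper's implementation: the exact form of the reversal is not specified here, so the `reverse_logits` hook simply negates the logits as a stand-in for the operation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def diag_fim(W, X, reverse_logits=False, seed=0):
    """Empirical diagonal Fisher for a linear softmax classifier z = W @ x.

    F ≈ mean over samples of (d log p(y|x) / dW)**2, with the label y
    drawn from the model's own predictive distribution (the standard
    FIM estimator used by EWC-style methods).
    """
    rng = np.random.default_rng(seed)
    F = np.zeros_like(W)
    for x in X:
        z = W @ x
        if reverse_logits:
            # Hypothetical Logits Reversal stand-in: sign-flip the logits
            # before they enter the FIM computation (illustration only).
            z = -z
        p = softmax(z)
        y = rng.choice(len(p), p=p)
        onehot = np.zeros_like(p)
        onehot[y] = 1.0
        # Closed-form gradient of log p(y|x) w.r.t. W for a linear model.
        g = np.outer(onehot - p, x)
        F += g ** 2
    return F / len(X)
```

In EWC, the resulting `F` would weight a quadratic penalty `sum(F * (W - W_old)**2)` when training on the next task; the reversal only changes how `F` itself is estimated.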
Problem

Research questions and friction points this paper is trying to address.

Continual Learning
Elastic Weight Consolidation
Catastrophic Forgetting
Fisher Information Matrix
Weight Importance Estimation
Innovation

Methods, ideas, or system contributions that make the work stand out.

Elastic Weight Consolidation
Continual Learning
Fisher Information Matrix
Logits Reversal
Catastrophic Forgetting