Diagnosing Shortcut-Induced Rigidity in Continual Learning: The Einstellung Rigidity Index (ERI)

📅 2025-09-30

🤖 AI Summary

In continual learning (CL), models can exploit non-causal spurious features, such as artificial color patches, leading to “shortcut-induced rigidity”: weights inherited from earlier tasks bias the model toward superficial correlations and impede the discovery of better solutions for new tasks. This failure mode is distinct from catastrophic forgetting. The work brings the cognitive-science concept of the Einstellung effect into CL and proposes the Einstellung Rigidity Index (ERI), an interpretable diagnostic with three facets: Adaptation Delay (AD), Performance Deficit (PD), and Relative Suboptimal Feature Reliance (SFR_rel). On a two-phase CIFAR-100 benchmark with a spurious magenta patch in Phase 2, the evaluated CL methods reach accuracy thresholds earlier than a Scratch-T2 baseline (negative AD) but end with slightly lower final accuracy on patched shortcut classes (positive PD). Masking the patch improves accuracy for the CL methods while slightly reducing it for Scratch-T2 (negative SFR_rel), which indicates that the patch acted as a distractor for CL models in this setting rather than a helpful shortcut. ERI is intended to separate genuine knowledge transfer from cue-inflated performance when assessing CL robustness.

📝 Abstract

Deep neural networks frequently exploit shortcut features, defined as incidental correlations between inputs and labels without causal meaning. Shortcut features undermine robustness and reduce reliability under distribution shifts. In continual learning (CL), the consequences of shortcut exploitation can persist and intensify: weights inherited from earlier tasks bias representation reuse toward whatever features most easily satisfied prior labels, mirroring the cognitive Einstellung effect, a phenomenon where past habits block optimal solutions. Whereas catastrophic forgetting erodes past skills, shortcut-induced rigidity throttles the acquisition of new ones. We introduce the Einstellung Rigidity Index (ERI), a compact diagnostic that disentangles genuine transfer from cue-inflated performance using three interpretable facets: (i) Adaptation Delay (AD), (ii) Performance Deficit (PD), and (iii) Relative Suboptimal Feature Reliance (SFR_rel). On a two-phase CIFAR-100 CL benchmark with a deliberately spurious magenta patch in Phase 2, we evaluate Naive fine-tuning (SGD), online Elastic Weight Consolidation (EWC_on), Dark Experience Replay (DER++), Gradient Projection Memory (GPM), and Deep Generative Replay (DGR). Across these methods, we observe that CL models reach accuracy thresholds earlier than a Scratch-T2 baseline (negative AD) but achieve slightly lower final accuracy on patched shortcut classes (positive PD). Masking the patch improves accuracy for CL methods while slightly reducing it for Scratch-T2, yielding negative SFR_rel. This pattern indicates that the patch acted as a distractor for CL models in this setting rather than a helpful shortcut.
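
The three ERI facets can be sketched in a few lines of Python. The abstract gives only the sign conventions (negative AD when CL is faster, positive PD when CL ends lower, negative SFR_rel when masking helps CL and hurts Scratch-T2), so the exact formulas below are assumptions chosen to match those signs, not the paper's definitions. All function and parameter names are hypothetical.

```python
from typing import Optional, Sequence


def epochs_to_threshold(acc_curve: Sequence[float], tau: float) -> Optional[int]:
    """Return the first 1-based epoch whose accuracy reaches tau, or None if never."""
    for epoch, acc in enumerate(acc_curve, start=1):
        if acc >= tau:
            return epoch
    return None


def adaptation_delay(cl_curve: Sequence[float],
                     scratch_curve: Sequence[float],
                     tau: float) -> Optional[int]:
    """Assumed AD: epochs-to-threshold of the CL method minus that of Scratch-T2.

    Negative values mean the CL method reached the threshold earlier.
    Returns None when either curve never reaches tau.
    """
    e_cl = epochs_to_threshold(cl_curve, tau)
    e_scratch = epochs_to_threshold(scratch_curve, tau)
    if e_cl is None or e_scratch is None:
        return None
    return e_cl - e_scratch


def performance_deficit(scratch_final_patched: float,
                        cl_final_patched: float) -> float:
    """Assumed PD: final Scratch-T2 accuracy minus final CL accuracy
    on the patched shortcut classes. Positive means CL ends lower."""
    return scratch_final_patched - cl_final_patched


def sfr_rel(cl_patched: float, cl_masked: float,
            scratch_patched: float, scratch_masked: float) -> float:
    """Assumed SFR_rel: (patched - masked) accuracy gap of the CL method
    minus the same gap for Scratch-T2. Negative means masking the patch
    helps the CL method relative to the baseline."""
    return (cl_patched - cl_masked) - (scratch_patched - scratch_masked)
```

As an illustration with made-up numbers (not results from the paper): a CL curve of `[0.40, 0.62, 0.70]` against a Scratch-T2 curve of `[0.20, 0.45, 0.61, 0.72]` at threshold 0.6 gives a negative AD, and final patched accuracies of 0.70 (CL) versus 0.72 (Scratch-T2) give a positive PD, matching the qualitative pattern the abstract reports.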

Problem

Research questions and friction points this paper is trying to address.

Diagnosing shortcut-induced rigidity in continual learning systems

Measuring how past shortcuts block new skill acquisition

Disentangling genuine transfer from shortcut-inflated performance

Innovation

Methods, ideas, or system contributions that make the work stand out.

Introducing the Einstellung Rigidity Index (ERI) as a diagnostic tool

Measuring rigidity via three interpretable quantitative facets

Evaluating shortcut reliance in continual learning methods
