PANDA - Patch And Distribution-Aware Augmentation for Long-Tailed Exemplar-Free Continual Learning

📅 2025-11-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
In exemplar-free continual learning (EFCL), pretrained models suffer from severe catastrophic forgetting exacerbated by dual imbalances in real-world data streams—inter-task distributional shift and intra-task long-tailed (or reverse-skewed) class distributions. Method: We propose a region-aware, distribution-adaptive enhancement framework. First, we formally characterize and model this dual imbalance. Second, leveraging CLIP to localize semantic key regions, we perform cross-class patch transplantation to strengthen representations of few-shot classes. Third, we dynamically adjust sampling weights based on historical task distributions to achieve inter-task learning balance. The framework freezes the backbone, ensuring lightweight efficiency and compatibility with diverse pretrained models. Results: Our method achieves significant accuracy gains across mainstream EFCL benchmarks, markedly mitigates forgetting, and demonstrates superior robustness and generalization—particularly under long-tailed scenarios.
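The cross-class patch transplantation step above can be sketched roughly as follows. The saliency grid here is random stand-in data; in PANDA it would come from CLIP patch-text similarity, and the patch size and selection rule are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def transplant_salient_patch(rare_img, freq_img, patch_scores, patch=16):
    """Copy the most salient patch of a rare-class image into a
    frequent-class sample at the same spatial location.

    patch_scores: (H//patch, W//patch) saliency grid; stand-in here for
    CLIP-derived patch saliency (hypothetical interface)."""
    gy, gx = np.unravel_index(np.argmax(patch_scores), patch_scores.shape)
    y, x = gy * patch, gx * patch
    out = freq_img.copy()
    out[y:y + patch, x:x + patch] = rare_img[y:y + patch, x:x + patch]
    return out, (y, x)

rng = np.random.default_rng(0)
rare = rng.random((64, 64, 3))   # rare-class image (synthetic)
freq = np.zeros((64, 64, 3))     # frequent-class image (synthetic)
scores = rng.random((4, 4))      # stand-in for CLIP patch saliency
aug, (y, x) = transplant_salient_patch(rare, freq, scores)
```

The augmented sample `aug` keeps the frequent-class background but carries a semantically salient region of the rare class, which is the mechanism the summary describes for strengthening few-shot representations.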

📝 Abstract
Exemplar-Free Continual Learning (EFCL) restricts the storage of previous task data and is highly susceptible to catastrophic forgetting. While pre-trained models (PTMs) are increasingly leveraged for EFCL, existing methods often overlook the inherent imbalance of real-world data distributions. We discovered that real-world data streams commonly exhibit dual-level imbalances: skewed dataset-level distributions combined with extreme or reversed skews within individual tasks, creating both intra-task and inter-task disparities that hinder effective learning and generalization. To address these challenges, we propose PANDA, a Patch-and-Distribution-Aware Augmentation framework that integrates seamlessly with existing PTM-based EFCL methods. PANDA amplifies low-frequency classes by using a CLIP encoder to identify representative regions and transplanting those into frequent-class samples within each task. Furthermore, PANDA incorporates an adaptive balancing strategy that leverages prior task distributions to smooth inter-task imbalances, reducing the gap in average sample counts across tasks and enabling fairer learning with frozen PTMs. Extensive experiments and ablation studies demonstrate PANDA's capability to work with existing PTM-based CL methods, improving accuracy and reducing catastrophic forgetting.
Problem

Research questions and friction points this paper is trying to address.

Addresses dual-level data imbalances in continual learning without exemplars
Amplifies low-frequency classes via CLIP-guided patch transplantation
Reduces catastrophic forgetting in pre-trained model based continual learning
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses CLIP encoder to identify representative patch regions
Transplants salient patches of low-frequency classes into high-frequency samples
Leverages prior task distributions for adaptive balancing strategy
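The adaptive balancing idea in the last point could look roughly like the sketch below: per-class sampling weights are raised toward a target count blended from the current task and the running history of past tasks. The blending rule and the `alpha` parameter are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def adaptive_weights(current_counts, history_counts, alpha=0.5):
    """Per-class sampling weights that up-weight classes whose sample
    counts fall below a target blended from the current task and the
    average of prior task counts (hypothetical smoothing rule)."""
    current = np.asarray(current_counts, dtype=float)
    target = alpha * current.mean() + (1 - alpha) * float(np.mean(history_counts))
    weights = target / np.maximum(current, 1.0)   # inverse-frequency scaling
    return weights / weights.sum()                # normalize to a distribution

w = adaptive_weights([100, 10, 1], [50, 50, 50])
```

Under this rule, the rarest class receives the largest sampling weight, which is the inter-task balancing effect the summary attributes to PANDA.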