🤖 AI Summary
To address the challenges of limited replay-memory capacity and insufficient sample diversity in class-incremental learning (CIL), this paper proposes an exemplar compression-and-regeneration framework built on pretrained diffusion models. The method freezes a general-purpose diffusion model (e.g., Stable Diffusion) and compresses image exemplars into compact vision-language prompts—specifically CLIP embeddings—achieving a 24× memory reduction; the stored prompts are later used to regenerate exemplars rather than to reconstruct the originals exactly. It further introduces a prompt-level partial-compression strategy and diffusion-guided synthesis augmentation to narrow the domain gap between generated and real data. On standard benchmarks including ImageNet-100, the approach outperforms the previous state of the art by 3.2% in average accuracy, substantially mitigates catastrophic forgetting, and keeps storage overhead ultra-low.
📝 Abstract
Replay-based methods in class-incremental learning (CIL) have attained remarkable success. Despite their effectiveness, the inherent memory restriction limits both the number and the diversity of the exemplars that can be saved. In this paper, we introduce PESCR, a novel approach that substantially increases the quantity and diversity of exemplars by leveraging a pre-trained, general-purpose diffusion model, without fine-tuning it on target datasets or storing it in the memory buffer. Images are compressed into visual and textual prompts, which are saved in place of the original images, decreasing memory consumption by a factor of 24. In subsequent phases, diverse exemplars are regenerated by the diffusion model. We further propose partial compression and diffusion-based data augmentation to minimize the domain gap between generated exemplars and real images. Comprehensive experiments demonstrate that PESCR significantly improves CIL performance across multiple benchmarks, e.g., by 3.2% over the previous state of the art on ImageNet-100.
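To make the reported 24× figure concrete, here is one plausible storage accounting: a raw 224×224×3 uint8 exemplar versus a compact prompt vector. The prompt size used below (1568 float32 values) is a hypothetical choice that happens to yield the 24× ratio; the actual prompt layout in PESCR may differ.

```python
from math import prod

def compression_ratio(img_shape=(224, 224, 3), prompt_floats=1568, bytes_per_float=4):
    """Ratio of raw-image bytes to stored-prompt bytes.

    Assumes the image is stored as uint8 (1 byte per channel value) and the
    prompt as float32. These sizes are illustrative assumptions, not taken
    from the paper.
    """
    image_bytes = prod(img_shape)                  # 224*224*3 = 150,528 bytes
    prompt_bytes = prompt_floats * bytes_per_float  # 1568*4 = 6,272 bytes
    return image_bytes / prompt_bytes

print(compression_ratio())  # 24.0
```

Under this accounting, the same memory buffer that once held one image can hold 24 prompts, which is the mechanism behind the claimed gain in exemplar quantity and diversity.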