🤖 AI Summary
Online class-incremental learning suffers from severe catastrophic forgetting, tight memory constraints, and the high computational overhead of distillation-based replay. To address these, this paper proposes a parameter-free, gradient-free grid-based patch sampling method. It generates low-resolution memory samples via sparse pixel sampling, increasing the information density per storage unit while preserving semantic content and structural information. The core innovation is a lightweight sampling mechanism that requires neither bilevel optimization nor a trainable model, so memory samples are distilled with zero additional parameters. The method integrates into mainstream replay frameworks without adding training overhead. Evaluated on multiple benchmarks, it improves average end accuracy by 3–4% at the same memory footprint, with limited computational cost.
📝 Abstract
Online class-incremental learning aims to enable models to continuously adapt to new classes with limited access to past data, while mitigating catastrophic forgetting. Replay-based methods address this by maintaining a small memory buffer of previous samples, achieving competitive performance. For effective replay under constrained storage, recent approaches leverage distilled data to enhance the informativeness of memory. However, such approaches often incur significant computational overhead because they rely on bi-level optimization. Motivated by these limitations, we introduce Grid-based Patch Sampling (GPS), a lightweight and effective strategy for distilling informative memory samples without relying on a trainable model. GPS generates informative samples by sampling a subset of pixels from the original image, yielding compact low-resolution representations that preserve both semantic content and structural information. During replay, these representations are reassembled to support training and evaluation. Experiments on extensive benchmarks demonstrate that GPS can be seamlessly integrated into existing replay frameworks, leading to 3%-4% improvements in average end accuracy under memory-constrained settings, with limited computational overhead.
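The sampling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `grid_sample` and `reassemble`, the fixed stride of 2, the fixed pixel offset, and the choice to reassemble by tiling several low-resolution samples into one full-size mosaic are all assumptions made for illustration. The abstract states only that a subset of pixels is sampled and that the resulting representations are reassembled during replay.

```python
import numpy as np

def grid_sample(image, stride, offset=(0, 0)):
    """Keep one pixel from every stride x stride grid cell (hypothetical helper).

    No parameters are learned and no gradients are computed: the
    low-resolution sample is obtained purely by strided indexing.
    """
    dy, dx = offset
    return image[dy::stride, dx::stride]

def reassemble(patches, grid):
    """Tile grid*grid low-resolution samples into one mosaic (assumed scheme)."""
    rows = [
        np.concatenate(patches[r * grid:(r + 1) * grid], axis=1)
        for r in range(grid)
    ]
    return np.concatenate(rows, axis=0)

# Four random 32x32 RGB images stand in for incoming stream samples.
rng = np.random.default_rng(0)
images = [rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8) for _ in range(4)]

# Each image shrinks to 16x16, so four of them fit in one 32x32 memory slot.
low_res = [grid_sample(img, stride=2) for img in images]
mosaic = reassemble(low_res, grid=2)
```

Under these assumptions, a memory slot that previously held one 32x32 image now holds subsampled views of four images, which is one way to read the claim of higher information density at an unchanged memory footprint.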