🤖 AI Summary
To address catastrophic forgetting in unsupervised continual learning, this paper proposes an uncertainty-driven generative replay framework that requires no memory buffer, pseudo-labeling, or pretraining. Leveraging cluster-level uncertainty estimation and a dynamic thresholding mechanism, it guides a DeepSeek-R1-enhanced CLIP vision-language model to synthesize semantically coherent replay samples—mimicking biological memory replay and enabling unsupervised feature adaptation. On CIFAR-10, CIFAR-100, CINIC-10, SVHN, and TinyImageNet, the method achieves knowledge retention rates of 98.13%, 73.06%, 93.41%, 95.18%, and 59.74%, respectively—averaging 4.36% higher than state-of-the-art approaches. The core contribution is the first unsupervised generative replay paradigm regulated by uncertainty feedback, uniquely balancing scalability with semantic fidelity.
📝 Abstract
Continual Learning entails progressively acquiring knowledge from new data while retaining previously acquired knowledge, thereby mitigating "Catastrophic Forgetting" in neural networks. Our work presents a novel uncertainty-driven Unsupervised Continual Learning framework using Generative Replay, namely "Replay to Remember (R2R)". The proposed R2R architecture efficiently uses unlabelled and synthetic labelled data in a balanced proportion through a cluster-level uncertainty-driven feedback mechanism and a VLM-powered generative replay module. Unlike traditional memory-buffer methods that depend on pretrained models and pseudo-labels, our R2R framework operates without any prior training. It leverages visual features from unlabelled data and adapts continuously using clustering-based uncertainty estimation coupled with dynamic thresholding. Concurrently, a generative replay mechanism, paired with a DeepSeek-R1-powered CLIP VLM, produces labelled synthetic data representative of past experiences, resembling biological visual thinking that replays memory in order to remember and act in new, unseen tasks. Extensive experimental analyses are carried out on the CIFAR-10, CIFAR-100, CINIC-10, SVHN, and TinyImageNet datasets. Our proposed R2R approach improves knowledge retention, achieving retention rates of 98.13%, 73.06%, 93.41%, 95.18%, and 59.74%, respectively, surpassing state-of-the-art methods by over 4.36%.
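The abstract describes cluster-level uncertainty estimation with dynamic thresholding as the feedback signal that triggers generative replay. The paper's exact formulas are not given here, so the following is only an illustrative sketch of one plausible instantiation: soft cluster assignments via a softmax over negative centroid distances, per-sample entropy as the uncertainty measure, and a threshold that adapts to the current uncertainty distribution. All function names and parameters (`temperature`, `alpha`) are hypothetical, not taken from the paper.

```python
import numpy as np

def cluster_uncertainty(features, centroids, temperature=1.0):
    """Per-sample and per-cluster uncertainty from soft cluster assignments.

    Illustrative sketch only: uncertainty is the entropy of a softmax over
    negative Euclidean distances to the cluster centroids (an assumption,
    not the paper's stated formulation).
    """
    # Pairwise distances: (n_samples, n_clusters)
    d = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=-1)
    logits = -d / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)

    # Entropy of the soft assignment = per-sample uncertainty
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)

    # Aggregate to cluster level via the hard assignment
    labels = p.argmax(axis=1)
    per_cluster = np.array([
        entropy[labels == k].mean() if (labels == k).any() else 0.0
        for k in range(len(centroids))
    ])
    return per_cluster, entropy

def dynamic_threshold(entropy, alpha=1.0):
    """A simple adaptive threshold: mean + alpha * std of current uncertainty.

    Samples (or clusters) above this threshold would be candidates for
    replay generation in a framework like the one described.
    """
    return entropy.mean() + alpha * entropy.std()
```

In this sketch, clusters whose mean entropy exceeds the dynamic threshold would be the ones for which the VLM-based replay module synthesizes representative samples; because the threshold tracks the running uncertainty distribution rather than a fixed constant, the trigger adapts as the feature space drifts across tasks.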