PDAC: Efficient Coreset Selection for Continual Learning via Probability Density Awareness

📅 2025-11-12
📈 Citations: 0
Influential: 0
🤖 AI Summary
Coreset construction in continual learning typically relies on bi-level optimization, incurring prohibitive computational overhead. Method: This paper proposes a probability-density-aware coreset selection method that avoids multi-level optimization by leveraging a novel insight, derived from local error decomposition, that a sample's probability density correlates positively with its contribution to suppressing model error. Accordingly, it introduces a density-prioritized selection mechanism. Furthermore, it develops a streaming EM algorithm, integrated with a projected Gaussian mixture model, to estimate the joint data density online, enabling efficient incremental coreset maintenance under dynamic data streams. Contribution/Results: Extensive experiments across diverse continual learning settings demonstrate that the proposed method outperforms mainstream baselines in both accuracy and efficiency: it achieves competitive or superior performance while drastically reducing computational cost, thereby unifying effectiveness and efficiency.
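The density-prioritized idea can be illustrated with a minimal sketch (not the paper's PDAC implementation): fit a density model to the candidate pool and keep the highest-density samples. Here a single multivariate Gaussian stands in for the paper's projected Gaussian mixture, and `select_coreset` with its parameters is a hypothetical name chosen for illustration.

```python
import numpy as np

def gaussian_log_density(X, mean, cov):
    """Log density of each row of X under N(mean, cov)."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    # Per-sample Mahalanobis distance diff^T cov^{-1} diff
    mah = np.einsum('ij,jk,ik->i', diff, inv, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + mah)

def select_coreset(features, buffer_size):
    """Return indices of the `buffer_size` highest-density samples.

    Single-Gaussian (K=1) stand-in for a fitted mixture; the paper
    instead estimates density with a projected Gaussian mixture.
    """
    mean = features.mean(axis=0)
    cov = np.cov(features, rowvar=False) + 1e-6 * np.eye(features.shape[1])
    log_p = gaussian_log_density(features, mean, cov)
    return np.argsort(log_p)[-buffer_size:][::-1]  # densest first

# Tiny demo: select a 20-sample buffer from 200 candidates
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
idx = select_coreset(X, buffer_size=20)
```

The selection itself is a single density scoring pass plus a sort, which is the source of the efficiency gain over bi-level optimization.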

📝 Abstract
Rehearsal-based Continual Learning (CL) maintains a limited memory buffer to store replay samples for knowledge retention, making these approaches heavily reliant on the quality of the stored samples. Current rehearsal-based CL methods typically construct the memory buffer by selecting a representative subset (referred to as a coreset), aiming to approximate the training efficacy of the full dataset with minimal storage overhead. However, mainstream Coreset Selection (CS) methods generally formulate CS as a bi-level optimization problem that requires numerous inner and outer iterations to solve, leading to substantial computational cost and thus limiting their practical efficiency. In this paper, we aim to provide a more efficient selection logic and scheme for coreset construction. To this end, we first analyze the Mean Squared Error (MSE) between the buffer-trained model and the Bayes-optimal model through the lens of localized error decomposition to investigate the contribution of samples from different regions to MSE suppression. Further theoretical and experimental analyses demonstrate that samples with high probability density play a dominant role in error suppression. Inspired by this, we propose the Probability Density-Aware Coreset (PDAC) method. PDAC leverages the Projected Gaussian Mixture (PGM) model to estimate each sample's joint density, enabling efficient density-prioritized buffer selection. Finally, we introduce the streaming Expectation Maximization (EM) algorithm to enhance the adaptability of the PGM parameters to streaming data, yielding Streaming PDAC (SPDAC) for streaming scenarios. Extensive comparative experiments show that our methods outperform other baselines across various CL settings while ensuring favorable efficiency.
Problem

Research questions and friction points this paper is trying to address.

Reducing computational cost in coreset selection for continual learning
Improving memory buffer quality through probability density awareness
Developing efficient streaming coreset selection for dynamic data scenarios
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses probability density for coreset selection
Employs Projected Gaussian Mixture model
Implements streaming EM algorithm adaptation
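The streaming-EM idea in the last bullet can be sketched as an online update of a Gaussian mixture's parameters from one sample at a time. This is a hedged illustration with diagonal covariances and a fixed forgetting rate; the paper's actual SPDAC update rule and the projection step are not reproduced here, and `StreamingGMM` is a hypothetical name.

```python
import numpy as np

class StreamingGMM:
    """Minimal streaming-EM sketch: per-sample E-step (responsibilities)
    followed by a stochastic M-step with learning rate `lr`."""

    def __init__(self, n_components, dim, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.w = np.full(n_components, 1.0 / n_components)  # mixture weights
        self.mu = rng.normal(size=(n_components, dim))      # component means
        self.var = np.ones((n_components, dim))             # diagonal variances
        self.lr = lr

    def _log_prob(self, x):
        # log N(x | mu_k, diag(var_k)) for each component k
        return -0.5 * (np.log(2 * np.pi * self.var)
                       + (x - self.mu) ** 2 / self.var).sum(axis=1)

    def update(self, x):
        # E-step: posterior responsibility of each component for x
        logp = np.log(self.w) + self._log_prob(x)
        r = np.exp(logp - logp.max())
        r /= r.sum()
        # M-step: move weights, means, variances toward this sample
        self.w = (1 - self.lr) * self.w + self.lr * r
        eta = (self.lr * r)[:, None]
        self.mu += eta * (x - self.mu)
        self.var += eta * ((x - self.mu) ** 2 - self.var)
        self.var = np.maximum(self.var, 1e-6)  # keep variances positive

    def log_density(self, x):
        # log p(x) = logsumexp_k [ log w_k + log N(x | mu_k, var_k) ]
        logp = np.log(self.w) + self._log_prob(x)
        m = logp.max()
        return m + np.log(np.exp(logp - m).sum())

# Usage: feed a stream one sample at a time, then score densities
gmm = StreamingGMM(n_components=2, dim=3)
rng = np.random.default_rng(42)
for _ in range(300):
    gmm.update(rng.normal(size=3))
```

Because each update touches only one sample, density estimates stay current as the stream drifts, which is what makes incremental buffer maintenance cheap.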
Junqi Gao
Shanghai AI Lab, Harbin Institute of Technology
Deep Learning · Generative Models · Continual Learning
Zhichang Guo
School of Mathematics, Harbin Institute of Technology, Harbin, P. R. China
Dazhi Zhang
School of Mathematics, Harbin Institute of Technology, Harbin, P. R. China
Yao Li
School of Mathematics, Harbin Institute of Technology, Harbin, P. R. China
Yi Ran
School of Mathematics, Harbin Institute of Technology, Harbin, P. R. China
Biqing Qi
Shanghai Artificial Intelligence Laboratory, Shanghai, P. R. China