StreamFP: Learnable Fingerprint-guided Data Selection for Efficient Stream Learning

📅 2024-06-11
📈 Citations: 0
Influential: 0
🤖 AI Summary
Stream learning must balance computational efficiency against model accuracy, particularly under dynamic data distributions and concept drift. This paper proposes a learnable fingerprint-based data selection mechanism that formulates coreset construction as a dynamic parametrization process, enabling online adaptive buffer updates. The contributions are threefold: (i) the first end-to-end learnable fingerprint embedding representation for streaming data; (ii) a fingerprint-guided coreset optimization framework integrated with gradient-aware sample selection; and (iii) real-time buffer management supporting multi-rate data streams. Experiments on stream learning benchmarks show accuracy improvements of 15.99% to 51.24% across varying data arrival rates and 4.6× higher training throughput compared to state-of-the-art approaches, while maintaining low memory overhead and robustness to concept drift.

📝 Abstract
Stream Learning (SL) requires models that can quickly adapt to continuously evolving data, posing significant challenges in both computational efficiency and learning accuracy. Effective data selection is critical in SL to ensure a balance between information retention and training efficiency. Traditional rule-based data selection methods struggle to accommodate the dynamic nature of streaming data, highlighting the necessity for innovative solutions that effectively address these challenges. Recent approaches to handling changing data distributions face challenges that limit their effectiveness in fast-paced environments. In response, we propose StreamFP, a novel approach that uniquely employs dynamic, learnable parameters called fingerprints to enhance data selection efficiency and adaptability in stream learning. StreamFP optimizes coreset selection through its unique fingerprint-guided mechanism for efficient training while ensuring robust buffer updates that adaptively respond to data dynamics, setting it apart from existing methods in stream learning. Experimental results demonstrate that StreamFP outperforms state-of-the-art methods by achieving accuracy improvements of 15.99%, 29.65%, and 51.24% compared to baseline models across varying data arrival rates, alongside a training throughput increase of 4.6x.
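The abstract describes two roles for fingerprints: guiding coreset selection and driving buffer updates. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes fingerprints are plain vectors, scores each sample by its best cosine similarity to any fingerprint, and keeps the top-scoring samples. The scoring rule and the names `fingerprint_score`, `select_coreset`, and `update_buffer` are hypothetical, and the fingerprints are held fixed here, whereas StreamFP learns them during training.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fingerprint_score(sample, fingerprints):
    """Assumed scoring rule: best similarity between a sample and any fingerprint."""
    return max(cosine(sample, f) for f in fingerprints)


def select_coreset(batch, fingerprints, k):
    """Return the indices of the k highest-scoring samples in the batch."""
    order = sorted(
        range(len(batch)),
        key=lambda i: fingerprint_score(batch[i], fingerprints),
        reverse=True,
    )
    return order[:k]


def update_buffer(buffer, incoming, fingerprints, capacity):
    """Merge incoming samples into the buffer, keeping the top `capacity` by score."""
    merged = buffer + incoming
    keep = select_coreset(merged, fingerprints, capacity)
    return [merged[i] for i in keep]


if __name__ == "__main__":
    fingerprints = [[1.0, 0.0]]  # fixed here; learnable in the paper
    batch = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
    print(select_coreset(batch, fingerprints, k=2))  # [0, 2]
```

In this toy version both selection and buffer maintenance reuse the same score, so a single set of fingerprints decides what is trained on and what is retained. A learned variant would update the fingerprints from a training signal rather than leaving them constant.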
Problem

Research questions and friction points this paper is trying to address.

Stream Learning
Data Selection
Adaptability

Innovation

Methods, ideas, or system contributions that make the work stand out.

StreamFP
Fingerprint Guided Mechanism
Efficient Learning
Tongjun Shi
National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China; Singapore University of Technology and Design
Shuhao Zhang
National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, China
Binbin Chen
Singapore University of Technology and Design
Bingsheng He
National University of Singapore