S+t-SNE - Bringing dimensionality reduction to data streams

📅 2024-03-26
🏛️ International Symposium on Intelligent Data Analysis
📈 Citations: 0
Influential: 0
🤖 AI Summary
To address the challenge of real-time dimensionality reduction and visualization for dynamic data streams, this paper proposes S+t-SNE, the first strictly single-pass streaming t-SNE algorithm. Methodologically, it introduces a sliding-window optimization framework coupled with a similarity reweighting mechanism. The framework integrates incremental gradient updates, dynamic neighborhood maintenance, adaptive learning rates, and online reconstruction of sparse similarity matrices, balancing local structure preservation against temporal efficiency. This design enables continuous embedding updates and millisecond-level incorporation of new samples. Experiments on diverse real-world data streams show that the method achieves over 90% of offline t-SNE's dimensionality reduction quality, reduces inference latency by two orders of magnitude, and maintains a constant memory footprint, substantially outperforming existing incremental or approximate alternatives.
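To make the sliding-window idea concrete, here is a minimal Python sketch of a bounded window that places each incoming point near its nearest neighbours in the existing embedding. It is not the authors' algorithm: the class name `SlidingWindowEmbedding`, the parameters `capacity` and `k`, and the Gaussian-weighted averaging rule are all assumptions chosen for illustration, and the gradient-based refinement that t-SNE would apply afterwards is omitted.

```python
import math
from collections import deque


class SlidingWindowEmbedding:
    """Hypothetical helper: keeps the last `capacity` points and their 2-D positions."""

    def __init__(self, capacity, k=3):
        self.k = k
        # deque(maxlen=...) discards the oldest entry automatically,
        # so the memory footprint stays bounded as the stream grows.
        self.window = deque(maxlen=capacity)

    def add(self, x, y=None):
        """Insert high-dimensional point x and return its low-dimensional position.

        If y is given (e.g. from an initial offline embedding) it is used as-is;
        otherwise the position is estimated from the current window.
        """
        if y is None:
            y = self._place(x)
        self.window.append((tuple(x), tuple(y)))
        return tuple(y)

    def _place(self, x):
        if not self.window:
            return (0.0, 0.0)
        # k nearest neighbours of x among the points currently in the window.
        nearest = sorted(self.window, key=lambda p: math.dist(x, p[0]))[: self.k]
        dists = [math.dist(x, p[0]) for p in nearest]
        # Assumed bandwidth: the mean neighbour distance (1.0 if all distances are zero).
        sigma = (sum(dists) / len(dists)) or 1.0
        weights = [math.exp(-((d / sigma) ** 2)) for d in dists]
        total = sum(weights)
        dim = len(nearest[0][1])
        # Weighted average of the neighbours' embedded positions.
        return tuple(
            sum(w * p[1][i] for w, p in zip(weights, nearest)) / total
            for i in range(dim)
        )
```

A caller would typically seed the window with points from an offline t-SNE run via `add(x, y)` and then feed stream samples with `add(x)`. A single pass over the window per sample keeps the cost independent of the stream length, which is the property the summary attributes to the method.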

Problem

Research questions and friction points this paper is trying to address.

Data Streaming
Visualization
Real-time Analysis

Innovation

Methods, ideas, or system contributions that make the work stand out.

S+t-SNE
Continuous Data Streaming
Real-time Pattern Discovery
Pedro C. Vieira
Department of Computer Science, Faculty of Sciences, University of Porto
Joao P. Montrezol
Department of Computer Science, Faculty of Sciences, University of Porto
Joao T. Vieira
Department of Computer Science, Faculty of Sciences, University of Porto
Joao Gama
Professor Emeritus, Faculty of Economics, University of Porto, and INESC TEC
Data Mining, Machine Learning, Data Stream Mining, Concept Drift