Hidden Sketch: A Space-Efficient Reversible Sketch for Tracking Frequent Items in Data Streams

📅 2025-05-18
🤖 AI Summary
To address the challenge of real-time, accurate tracking of frequent items (heavy hitters and heavy changers) in data streams under resource-constrained settings, this paper proposes a reversible sketch framework that jointly achieves high accuracy and memory efficiency. Conventional sketches inherently trade accuracy for space; this approach is presented as the first to integrate a Reversible Bloom Filter (RBF) with a Count-Min (CM) Sketch, enabling lossless joint reconstruction of key-frequency pairs. The paper provides a theoretical proof of reversibility and designs a compact encoding scheme. Theoretically, the framework is stated to attain optimal space complexity, O(1/ε), for ε-approximate frequency estimation. Empirically, it is reported to achieve up to 3.2× higher accuracy than state-of-the-art methods under identical memory budgets, to support microsecond-scale per-item updates, and to enable millisecond-scale full reconstruction of the entire frequency distribution.

📝 Abstract
Modern data stream applications demand memory-efficient solutions for accurately tracking frequent items, such as heavy hitters and heavy changers, under strict resource constraints. Traditional sketches face inherent accuracy-memory trade-offs: they either lose precision to reduce memory usage or inflate memory costs to enable high recording capacity. This paper introduces Hidden Sketch, a space-efficient reversible data structure for key and frequency encoding. Our design uniquely combines a Reversible Bloom Filter (RBF) and a Count-Min (CM) Sketch for invertible key and frequency storage, enabling precise reconstruction for both keys and their frequencies with minimal memory. Theoretical analysis establishes Hidden Sketch's space complexity and guaranteed reversibility, while extensive experiments demonstrate its substantial improvements in accuracy and space efficiency in frequent item tracking tasks. By eliminating the trade-off between reversibility and space efficiency, Hidden Sketch provides a scalable foundation for real-time stream analytics in resource-constrained environments.
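The abstract names a Count-Min (CM) Sketch as the frequency-storage component but this page does not give its configuration. As a rough illustration only, the following is a textbook Count-Min sketch using the standard sizing (width = ceil(e/ε), depth = ceil(ln(1/δ))). The class name, the `epsilon`/`delta` parameters, and the use of `blake2b` as the hash are assumptions made for this sketch, not details taken from Hidden Sketch.

```python
import hashlib
import math


class CountMinSketch:
    """Textbook Count-Min sketch; not the paper's exact configuration."""

    def __init__(self, epsilon: float, delta: float):
        # Standard sizing: width = ceil(e / epsilon), depth = ceil(ln(1 / delta)).
        self.width = math.ceil(math.e / epsilon)
        self.depth = max(1, math.ceil(math.log(1.0 / delta)))
        self.rows = [[0] * self.width for _ in range(self.depth)]

    def _index(self, key: str, row: int) -> int:
        # One salted hash per row; blake2b is an illustrative choice.
        digest = hashlib.blake2b(key.encode(), digest_size=8,
                                 salt=row.to_bytes(8, "little")).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key: str, count: int = 1) -> None:
        for row in range(self.depth):
            self.rows[row][self._index(key, row)] += count

    def estimate(self, key: str) -> int:
        # Collisions only add, so the minimum over rows never undercounts.
        return min(self.rows[row][self._index(key, row)]
                   for row in range(self.depth))


stream = ["a", "b", "a", "c", "a", "b"]
cms = CountMinSketch(epsilon=0.01, delta=0.01)
for item in stream:
    cms.update(item)
```

On its own, a structure like this answers frequency queries only for keys the caller already knows, and its estimates can overcount. That is the gap the abstract's "invertible key and frequency storage" is meant to close.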
Problem

Research questions and friction points this paper is trying to address.

Tracking frequent items in data streams with limited memory
Overcoming accuracy-memory trade-offs in traditional sketch designs
Enabling reversible key and frequency storage with minimal space
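To make the two tracking tasks concrete, here is an exact, memory-unbounded baseline of the kind a sketch tries to approximate. The threshold-based definitions (a heavy hitter has count at least `threshold` in a window; a heavy changer's count differs by at least `threshold` between two windows) are the commonly used ones and are assumed here, since this page does not state the paper's precise definitions. Function names are hypothetical.

```python
from collections import Counter


def heavy_hitters(window, threshold):
    """Keys whose exact count in the window is at least threshold."""
    counts = Counter(window)
    return {key: n for key, n in counts.items() if n >= threshold}


def heavy_changers(window_a, window_b, threshold):
    """Keys whose count differs by at least threshold between two windows."""
    before, after = Counter(window_a), Counter(window_b)
    return {key: after[key] - before[key]
            for key in before.keys() | after.keys()
            if abs(after[key] - before[key]) >= threshold}


first = ["x", "x", "x", "y", "z"]
second = ["y", "y", "y", "y", "x"]
hitters = heavy_hitters(first, threshold=3)             # {"x": 3}
changers = heavy_changers(first, second, threshold=2)   # {"x": -2, "y": 3}
```

This baseline stores one counter per distinct key, which is exactly the memory cost that becomes prohibitive under the constraints listed above.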
Innovation

Methods, ideas, or system contributions that make the work stand out.

Combines Reversible Bloom Filter and Count-Min Sketch
Enables invertible key and frequency storage
Minimizes memory while ensuring reversibility
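This page does not describe how the Reversible Bloom Filter works internally, so the following is not that structure. It is only a toy showing what "reversible" means in the weakest sense: with a plain Bloom filter and a small, enumerable key universe, testing every possible key recovers a superset of the inserted keys, because a Bloom filter never produces false negatives. All names and parameters here are hypothetical.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter; a stand-in, not the paper's Reversible Bloom Filter."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key: int):
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(key.to_bytes(8, "little"), digest_size=8,
                                     salt=i.to_bytes(8, "little")).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, key: int) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def __contains__(self, key: int) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


def recover_keys(bloom: BloomFilter, universe):
    """Brute-force decode: every key in the universe that the filter accepts."""
    return {key for key in universe if key in bloom}


inserted = {3, 17, 42}
bloom = BloomFilter(num_bits=1024, num_hashes=4)
for key in inserted:
    bloom.add(key)
recovered = recover_keys(bloom, range(256))
```

Brute-force enumeration like this scales with the size of the key universe and may return false positives, so it is impractical for real key spaces. A purpose-built reversible structure is presumably what makes key recovery feasible in the paper.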