LMQ-Sketch: Lagom Multi-Query Sketch for High-Rate Online Analytics

📅 2025-06-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

Problem: Supporting several query types (e.g., point queries and F₁/F₂ frequency-moment estimation) concurrently with high-rate updates on a data stream remains challenging; prior work addresses either multi-query sketches or concurrent querying, but not both. Method: The paper proposes LMQ-Sketch, a single composite sketch that answers frequency point queries, F₁, and F₂ concurrently with updates. Its cornerstone is the “Lagom” method for low-latency global querying, which builds on a simple geometric argument and combines work distribution with synchronization to provide concurrency semantics (monotonicity of operations and intermediate value linearizability). Contribution/Results: The authors report global query latency below 100 μs, throughput above 2 billion updates/s, and a memory budget about an order of magnitude smaller than that of state-of-the-art methods, together with an accuracy analysis of Lagom. Throughput is described as highly competitive with methods that cover only mixed queries or only concurrency.

📝 Abstract

Data sketches balance resource efficiency with controllable approximations for extracting features in high-volume, high-rate data. Two important points of interest are highlighted separately in recent works; namely, to (1) answer multiple types of queries from one pass, and (2) query concurrently with updates. Several fundamental challenges arise when integrating these directions, which we tackle in this work. We investigate the trade-offs to be balanced and synthesize key ideas into LMQ-Sketch, a single, composite data sketch supporting multiple queries (frequency point queries, and frequency moments F1 and F2) concurrently with updates. Our method 'Lagom' is a cornerstone of LMQ-Sketch for low-latency global querying (<100 μs), combining freshness, timeliness, and accuracy with a low memory footprint and high throughput (>2B updates/s). We analyze and evaluate the accuracy of Lagom, which builds on a simple geometric argument and efficiently combines work distribution with synchronization for proper concurrency semantics -- monotonicity of operations and intermediate value linearizability. Compared with state-of-the-art methods (which, as mentioned, only cover either mixed queries or concurrency), LMQ-Sketch shows highly competitive throughput, with additional accuracy guarantees and concurrency semantics, while also reducing the required memory budget by an order of magnitude. We expect the methodology to have broader impact on concurrent multi-query sketches.
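
To make the idea of one sketch answering several query types concrete, here is a minimal, purely sequential sketch in Python. It is not the LMQ-Sketch or Lagom design, which the abstract does not specify in detail. It assumes a standard Count Sketch / Fast-AGMS style table and an insert-only stream; the class name `MultiQuerySketch`, its parameters, and the salted-hash scheme are all hypothetical choices for illustration.

```python
import random
import statistics


class MultiQuerySketch:
    """Illustrative Count Sketch style table (assumption: not the paper's structure)."""

    def __init__(self, rows=5, width=1024, seed=0):
        rng = random.Random(seed)
        self.rows = rows
        self.width = width
        # One (bucket salt, sign salt) pair per row; a hypothetical hashing scheme.
        self.salts = [(rng.getrandbits(64), rng.getrandbits(64)) for _ in range(rows)]
        self.table = [[0] * width for _ in range(rows)]
        self.total = 0  # exact F1 when all updates are non-negative

    def _bucket(self, row, key):
        return hash((self.salts[row][0], key)) % self.width

    def _sign(self, row, key):
        return 1 if hash((self.salts[row][1], key)) & 1 else -1

    def update(self, key, count=1):
        """Add `count` occurrences of `key` to every row."""
        self.total += count
        for r in range(self.rows):
            self.table[r][self._bucket(r, key)] += self._sign(r, key) * count

    def point(self, key):
        """Frequency estimate for `key`: median of the signed counters."""
        return statistics.median(
            self._sign(r, key) * self.table[r][self._bucket(r, key)]
            for r in range(self.rows)
        )

    def f1(self):
        """First frequency moment (total number of items seen)."""
        return self.total

    def f2(self):
        """Second frequency moment: median over rows of the sum of squared counters."""
        return statistics.median(sum(c * c for c in row) for row in self.table)
```

All three queries read the same table, which is the "one pass, multiple queries" property the abstract refers to. The part this sketch leaves out, and which the paper addresses, is answering those queries while other threads are updating the table.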

Problem

Research questions and friction points this paper is trying to address.

Balancing resource efficiency with approximations in high-rate data analytics

Supporting multiple query types concurrently with data updates

Achieving low-latency global querying with high throughput and accuracy

Innovation

Methods, ideas, or system contributions that make the work stand out.

LMQ-Sketch supports multiple query types (frequency point queries, F1, and F2) concurrently with updates

Lagom ensures low-latency global querying with high throughput

Combines work distribution and synchronization for concurrency semantics
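
The concurrency semantics named above can be illustrated for the simplest case, a counter that only receives non-negative increments. The sketch below rests on one assumption about intermediate value linearizability: a read is acceptable if its value lies between the smallest and largest values that some linearization could return. It is not the paper's analysis, and the function names and the `(invoke_time, response_time, amount)` event format are hypothetical.

```python
def ivl_bounds(read_start, read_end, increments):
    """Return (low, high) bounds for a read of a monotone counter.

    increments: list of (invoke_time, response_time, amount), amount >= 0.
    low  counts increments that completed before the read was invoked.
    high counts increments that were invoked before the read responded.
    """
    low = sum(a for inv, resp, a in increments if resp < read_start)
    high = sum(a for inv, resp, a in increments if inv < read_end)
    return low, high


def is_ivl_read(value, read_start, read_end, increments):
    """True if `value` lies within the bounds allowed for this read interval."""
    low, high = ivl_bounds(read_start, read_end, increments)
    return low <= value <= high
```

For example, with increments `[(0, 1, 5), (2, 6, 3), (7, 8, 4)]` and a read spanning times 3 to 5, the bounds are 5 and 8. A read returning 6 is accepted even though no linearization yields exactly 6, which is the relaxation that distinguishes this condition from plain linearizability.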

🔎 Similar Papers

Sorting-based FPGA Sliding Window Aggregation Engine without off-chip Memories