🤖 AI Summary
Problem: Network devices have tight memory budgets, and the memory that is free fluctuates over time. This limits the deployment of streaming-analytics sketches (e.g., Count Sketch, Count-Min Sketch, UnivMon) for network-wide traffic monitoring.
Method: This paper proposes a spatio-temporal sketch decoupling paradigm that dynamically shards sketch instances across time windows and switch nodes, enabling collaborative construction of a global traffic view. It introduces an adaptive time-window partitioning scheme and a space-aware distributed allocation algorithm, supporting fragmented deployment, heterogeneous resource adaptation, and runtime reconfiguration.
Contribution/Results: Experiments show that, at equal estimation error, memory size is reduced by approximately 75% for many queries; with memory held fixed, estimation error falls by nearly an order of magnitude. The approach removes the requirement that a sketch be deployed whole on a single node, improving the accuracy–resource trade-off for in-network analytics.
📝 Abstract
Streaming analytics are essential in a wide range of applications, including databases, networking, and machine learning. To optimize performance, practitioners are increasingly offloading such analytics to network nodes such as switches. However, resources available at switches, such as fast SRAM, are limited and non-uniform, and may be shared with other functions (e.g., a firewall). Moreover, resource availability can change over time due to the dynamic demands of in-network applications. In this paper, we propose a new approach to disaggregating data structures over time and space, leveraging any residual resources available at network nodes. We focus on sketches, which are fundamental for summarizing data in streaming analytics and offer favorable space-accuracy tradeoffs. Our idea is to break sketches into multiple "fragments" that are placed at different network nodes. The fragments cover different time periods, vary in size, and are combined to form a network-wide view of the underlying traffic. We apply our solution to three popular sketches (Count Sketch, Count-Min Sketch, and UnivMon) and demonstrate that we can achieve approximately a 75% reduction in memory size at the same error for many queries, or a near order-of-magnitude error reduction if memory is kept unchanged.
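To make the fragment idea concrete, the following is a minimal sketch in Python, not the paper's algorithm. It assumes that each fragment is an ordinary Count-Min Sketch of its own width, that fragments cover disjoint time windows, and that a network-wide estimate is obtained by summing per-fragment estimates. The names `Fragment`, `record`, and `network_wide_estimate`, the switch labels, and the window boundaries are all hypothetical; the abstract does not specify how fragments are placed or combined.

```python
import hashlib


class CountMinSketch:
    """Plain Count-Min Sketch: `depth` rows of `width` counters each."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key, row):
        # Deterministic per-row hash derived from SHA-256 (illustrative choice).
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "little") % self.width

    def update(self, key, count=1):
        for r in range(self.depth):
            self.rows[r][self._index(key, r)] += count

    def estimate(self, key):
        # The minimum over rows never underestimates the true count.
        return min(self.rows[r][self._index(key, r)] for r in range(self.depth))


class Fragment:
    """Hypothetical fragment: a sketch at one node covering [start, end)."""

    def __init__(self, node, start, end, width, depth=3):
        self.node, self.start, self.end = node, start, end
        self.sketch = CountMinSketch(width, depth)

    def covers(self, t):
        return self.start <= t < self.end


def record(fragments, key, t):
    """Route an item seen at time t to the fragment whose window covers t."""
    for frag in fragments:
        if frag.covers(t):
            frag.sketch.update(key)
            return
    raise ValueError(f"no fragment covers time {t}")


def network_wide_estimate(fragments, key):
    """Assumed combination rule: sum per-fragment estimates over disjoint windows."""
    return sum(frag.sketch.estimate(key) for frag in fragments)


# Fragment sizes differ to mimic uneven residual memory across switches.
fragments = [
    Fragment("switch-A", 0, 10, width=64),
    Fragment("switch-B", 10, 15, width=16),
    Fragment("switch-C", 15, 30, width=128),
]
packets = [("10.0.0.1", t) for t in range(0, 30, 2)]
packets += [("10.0.0.2", t) for t in range(1, 30, 3)]
for flow, t in packets:
    record(fragments, flow, t)

true_count = sum(1 for flow, _ in packets if flow == "10.0.0.1")
estimate = network_wide_estimate(fragments, "10.0.0.1")
print(true_count, estimate)
```

Because each Count-Min fragment can only overestimate, the summed estimate is never below the true count under these assumptions. Narrower fragments contribute more collision error, which is the kind of size-versus-accuracy tradeoff a placement scheme would need to balance.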