🤖 AI Summary
Persistent-sparse (PS) flows, which combine high persistence with low traffic density, are hard to catch for early network threat warning because conventional heavy-flow and persistent-flow identification methods miss them.
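Method: This paper introduces a new “anomaly boundary” dual-threshold criterion for identifying PS flows: a flow whose persistence exceeds one threshold is protected, and a protected flow whose density falls below a second threshold is reported as a PS flow. It also presents PSSketch, a high-precision layered sketch built on variable-length bitwise counters. The first layer tracks the frequency and persistence of all flows; the second layer protects potential PS flows and records overflow counts from the first. Further optimizations reduce memory consumption and improve accuracy.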
Results: Experiments show that PSSketch reduces memory consumption by an order of magnitude compared with a strawman solution built from existing work. Against state-of-the-art solutions for finding PS flows, it achieves an F1 score up to 2.94× higher, reduces ARE by one to two orders of magnitude, and delivers higher throughput.
📝 Abstract
Finding persistent sparse (PS) flows is critical to the early warning of many threats. Previous work has focused predominantly on either heavy or persistent flows, with limited attention to PS flows. Although some recent studies address PS flows, they struggle to establish an objective criterion because of insufficient data-driven observations, which reduces accuracy. In this paper, we define a new criterion, the "anomaly boundary", to distinguish PS flows from regular flows. Specifically, a flow whose persistence exceeds a threshold is protected, and a protected flow whose density is lower than a threshold is reported as a PS flow. We then introduce PSSketch, a high-precision layered sketch for finding PS flows. PSSketch employs variable-length bitwise counters: the first layer tracks the frequency and persistence of all flows, and the second layer protects potential PS flows and records overflow counts from the first layer. Further optimizations reduce memory consumption and improve accuracy. Experiments show that PSSketch reduces memory consumption by an order of magnitude compared with a strawman solution that combines existing work. Compared with state-of-the-art (SOTA) solutions for finding PS flows, it achieves an F1 score up to 2.94x higher and reduces ARE by 1-2 orders of magnitude, while also achieving higher throughput.
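To make the dual-threshold criterion concrete, the following is a minimal exact (non-sketch) reference in Python. It is not PSSketch itself and uses no compact counters. It rests on assumptions the abstract does not state: persistence is taken to be the number of distinct time windows in which a flow appears, and density is taken to be frequency divided by persistence. The function name `find_ps_flows`, its parameters, and the sample threshold values are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Set, Tuple


def find_ps_flows(
    packets: Iterable[Tuple[int, Hashable]],
    persistence_threshold: int,
    density_threshold: float,
) -> List[Hashable]:
    """Exact reference for the two-step criterion (illustrative only).

    packets: (window_index, flow_id) pairs, one per packet.
    """
    frequency: Dict[Hashable, int] = defaultdict(int)
    windows_seen: Dict[Hashable, Set[int]] = defaultdict(set)

    for window, flow in packets:
        frequency[flow] += 1            # total packets of this flow
        windows_seen[flow].add(window)  # distinct windows it appeared in

    ps_flows = []
    for flow, freq in frequency.items():
        # ASSUMPTION: persistence = number of distinct windows with traffic.
        persistence = len(windows_seen[flow])
        if persistence <= persistence_threshold:
            continue  # step 1: only sufficiently persistent flows are "protected"
        # ASSUMPTION: density = packets per window in which the flow appears.
        density = freq / persistence
        if density < density_threshold:
            ps_flows.append(flow)  # step 2: protected and sparse
    return sorted(ps_flows, key=str)


# Three synthetic flows over 10 windows.
packets = [(w, "beacon") for w in range(10)]                     # 1 packet per window
packets += [(w, "bulk") for w in range(10) for _ in range(50)]   # 50 packets per window
packets += [(0, "burst")] * 3                                    # 3 packets, one window

print(find_ps_flows(packets, persistence_threshold=5, density_threshold=2.0))
# prints ['beacon']
```

Under these assumptions, only the low-rate flow that shows up in every window is reported: the high-rate flow is persistent but dense, and the short burst never becomes protected. A sketch-based design would replace the exact dictionaries with compact approximate counters to save memory.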