Cardinality is Not Enough: Super Host Detection via Segmented Cardinality Estimation

📅 2026-04-01
🤖 AI Summary
Existing sketch-based super host (superspreader) detection methods estimate flow cardinality from full IP addresses only. They therefore ignore communication patterns within subnets, which leads to high false-positive rates and low accuracy. Hierarchical approaches can capture subnet-level cardinalities, but at a high memory cost. This work proposes SegSketch, which uses a lightweight halved-segment hashing strategy to infer the length of shared IP prefixes and performs segmented cardinality estimation within subnets, capturing communication locality while staying memory-efficient. In experiments on real-world traces, SegSketch improves F1 score by up to 8.04× over state-of-the-art methods, particularly under small memory budgets.
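For readers unfamiliar with the metric, the F1 score for a detection task can be computed as in the minimal sketch below. It uses the standard definition (harmonic mean of precision and recall); the function name `f1_score` and the set-based inputs are illustrative assumptions, not taken from the paper's evaluation code.

```python
def f1_score(detected: set, truth: set) -> float:
    """Standard F1 for a detection task.

    detected: hosts the detector reported as super hosts (hypothetical input)
    truth:    hosts that really are super hosts (hypothetical input)
    """
    true_positives = len(detected & truth)
    if true_positives == 0:
        # No correct detections: precision or recall is zero, so F1 is zero.
        return 0.0
    precision = true_positives / len(detected)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)
```

A detector with many false positives has low precision and therefore a low F1 score even if it finds every true super host.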
📝 Abstract
Accurately detecting super hosts, which establish connections to a large number of distinct peers, is important for mitigating web attacks and ensuring high-quality web service. Existing sketch-based approaches estimate the number of distinct connections, called flow cardinality, from full IP addresses. They ignore the fact that a malicious or victim super host often communicates with hosts within the same subnet, which results in high false-positive rates and low accuracy. Although hierarchical approaches can capture flow cardinality within a subnet, they inherently suffer from high memory usage. To address these limitations, we propose SegSketch, a segmented cardinality estimation approach that employs a lightweight halved-segment hashing strategy to infer the common prefix lengths of IP addresses and estimates cardinality within subnets to improve detection accuracy under constrained memory. Experiments driven by real-world traces demonstrate that SegSketch improves F1-Score by up to 8.04x compared to state-of-the-art solutions, particularly under small memory budgets.
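To make the two quantities in the abstract concrete, the sketch below computes the common prefix length of two IPv4 addresses and the exact number of distinct peers a host contacts within each subnet. It is only an illustration of what "segmented cardinality" measures: it uses exact sets rather than a sketch, it is not SegSketch or its halved-segment hashing, and the function names and the fixed `/24` prefix length are assumptions made for this example.

```python
import ipaddress
from collections import defaultdict


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading bits two IPv4 addresses share (0 to 32)."""
    diff = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    # The highest set bit of the XOR marks the first differing position.
    return 32 - diff.bit_length()


def segmented_cardinality(flows, prefix_len=24):
    """Exact per-subnet distinct-peer counts for each source host.

    flows: iterable of (src_ip, dst_ip) string pairs (hypothetical input).
    prefix_len: assumed subnet size; a real system would have to infer it.
    """
    peers = defaultdict(lambda: defaultdict(set))
    for src, dst in flows:
        subnet = ipaddress.ip_network(f"{dst}/{prefix_len}", strict=False)
        peers[src][str(subnet)].add(dst)  # sets deduplicate repeated flows
    return {
        src: {net: len(dsts) for net, dsts in nets.items()}
        for src, nets in peers.items()
    }
```

Exact sets like these grow with the number of distinct peers, which is why memory-bounded sketches are used in practice.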
Problem

Research questions and friction points this paper is trying to address.

super host detection
flow cardinality
subnet
false positive rate
memory efficiency
Innovation

Methods, ideas, or system contributions that make the work stand out.

segmented cardinality estimation
super host detection
halved-segment hashing
subnet-aware sketch
flow cardinality
Yilin Zhao
Central South University
Jiawei Huang
Central South University
data center network, congestion control, TCP
Xianshi Su
Central South University
Weihe Li
The University of Edinburgh
Xin Li
Central South University
Yan Liu
Central South University
Jiacheng Xie
Central South University
Qichen Su
Central South University
Jin Ye
Guangxi University
Wanchun Jiang
Central South University
Compute Network
Jianxin Wang
School of Computer Science and Engineering, Central South University
Algorithm, Bioinformatics, Computer Network