🤖 AI Summary
This work addresses the challenge of analyzing massive graph streams under tight memory constraints by proposing EdgeSketch, a compact graph stream summary that is built in a single pass over the edge stream. EdgeSketch provides unbiased estimators of key graph properties with controllable variance, and graph algorithms such as Louvain community detection can run directly on the compressed representation. In experiments on community detection and on graph reconstruction via node similarity estimation, EdgeSketch uses substantially less memory and runs faster than both lossless representations and prior sketching approaches, while maintaining reliable accuracy.
📝 Abstract
We introduce EdgeSketch, a compact graph representation for efficient analysis of massive graph streams. EdgeSketch provides unbiased estimators for key graph properties with controllable variance and supports implementing graph algorithms on the stored summary directly. It is constructed in a fully streaming manner, requiring a single pass over the edge stream, while offline analysis relies solely on the sketch. We evaluate the proposed approach on two representative applications: community detection via the Louvain method and graph reconstruction through node similarity estimation. Experiments demonstrate substantial memory savings and runtime improvements over both lossless representations and prior sketching approaches, while maintaining reliable accuracy.
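To make the idea of a single-pass, per-node summary concrete, the following is a minimal sketch that uses plain MinHash signatures of node neighborhoods. It is not EdgeSketch's actual data structure or estimator, which the abstract does not specify. The names build_sketch, estimate_jaccard, and the parameter k are hypothetical and chosen only for illustration. MinHash is swapped in here because it shares the properties the abstract highlights: it is built in one pass over the edges, its memory per node is fixed, and its Jaccard estimate is unbiased with a variance that shrinks as k grows.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

MAX_HASH = 2**64  # sentinel larger than any 64-bit hash value


def _hash(seed: int, node: int) -> int:
    """Deterministic 64-bit hash of a node id under a given seed."""
    data = f"{seed}:{node}".encode()
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def build_sketch(edge_stream: Iterable[Tuple[int, int]], k: int = 64) -> Dict[int, List[int]]:
    """Single pass over an undirected edge stream.

    Keeps k minimum hash values per node (a MinHash signature of its
    neighborhood), so memory per node does not depend on its degree.
    NOTE: illustrative stand-in, not the EdgeSketch construction.
    """
    sketch: Dict[int, List[int]] = {}
    for u, v in edge_stream:
        # Each edge updates the signature of both endpoints.
        for node, neighbor in ((u, v), (v, u)):
            sig = sketch.setdefault(node, [MAX_HASH] * k)
            for i in range(k):
                h = _hash(i, neighbor)
                if h < sig[i]:
                    sig[i] = h
    return sketch


def estimate_jaccard(sketch: Dict[int, List[int]], u: int, v: int) -> float:
    """Fraction of matching signature slots, which estimates the Jaccard
    similarity of the two neighborhoods."""
    su, sv = sketch[u], sketch[v]
    return sum(a == b for a, b in zip(su, sv)) / len(su)


# Toy stream: nodes 1 and 2 share the same neighbors; node 3 shares none.
edges = [(1, 10), (1, 11), (2, 10), (2, 11), (3, 20), (3, 21)]
sketch = build_sketch(edges, k=64)
sim_same = estimate_jaccard(sketch, 1, 2)
sim_disjoint = estimate_jaccard(sketch, 1, 3)
```

Once the pass is complete, the similarity queries touch only the signatures and never the original edges, which mirrors the abstract's split between streaming construction and offline analysis on the summary alone. For true Jaccard similarity J, the variance of this MinHash estimate is J(1 - J)/k, so a larger k trades memory for accuracy.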