🤖 AI Summary
This work addresses the problem of learning over graph Laplacians in a distributed streaming setting, where it is difficult to achieve computational efficiency, high approximation accuracy, and a distributed representation at the same time. The paper proposes GSQUEAK, a novel algorithm that achieves strong spectral approximation guarantees for graph sparsification while processing edges in a single pass and in a distributed fashion. Using effective-resistance-based sampling, GSQUEAK sparsifies the Laplacian by keeping only a small subset of edges, which reduces both computational and communication overhead. The resulting sparse graphs remain spectrally close to the original graphs while improving scalability and real-time performance for large-scale graph learning tasks.
📝 Abstract
Graph-based techniques and spectral graph theory have enriched the field of machine learning with a variety of critical advances. A central object in the analysis is the graph Laplacian L, which encodes the structure of the graph. We consider the problem of learning over this Laplacian in a distributed streaming setting, where new edges of the graph are observed in real time by a network of workers. In this setting, it is hard to learn quickly and with good approximation while keeping a distributed representation of L. To address this challenge, we present a novel algorithm, GSQUEAK, which efficiently sparsifies the Laplacian by maintaining a small subset of effective resistances. We show that our algorithm produces sparsifiers with strong spectral approximation guarantees, all while processing edges in a single pass and in a distributed fashion.
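To make the idea of effective-resistance-based sparsification concrete, the following is a minimal sketch of the classical batch scheme: sample edges with probability proportional to weight times effective resistance, then reweight the kept edges so that the sparsifier's Laplacian matches the original in expectation. It is not GSQUEAK itself. It assumes the whole graph fits in memory on one machine, computes exact resistances with a dense pseudoinverse, and makes no attempt at streaming or distribution. The function names (`laplacian`, `effective_resistances`, `sparsify`) and the `num_samples` parameter are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def laplacian(n, edges, weights):
    """Dense Laplacian L = D - A of an undirected weighted graph on n nodes."""
    L = np.zeros((n, n))
    for (u, v), w in zip(edges, weights):
        L[u, u] += w
        L[v, v] += w
        L[u, v] -= w
        L[v, u] -= w
    return L


def effective_resistances(n, edges, weights):
    """Exact effective resistance of every edge via the pseudoinverse of L.

    R_uv = (e_u - e_v)^T L^+ (e_u - e_v). This costs O(n^3) and is only
    meant for small illustrative graphs.
    """
    L_pinv = np.linalg.pinv(laplacian(n, edges, weights))
    return np.array(
        [L_pinv[u, u] + L_pinv[v, v] - 2.0 * L_pinv[u, v] for u, v in edges]
    )


def sparsify(n, edges, weights, num_samples, rng):
    """Batch sparsification by effective-resistance sampling (illustrative only).

    Draws num_samples edges with replacement, with probability proportional
    to w_e * R_e, and reweights each kept edge so that the expected
    Laplacian of the output equals the Laplacian of the input.
    """
    weights = np.asarray(weights, dtype=float)
    scores = weights * effective_resistances(n, edges, weights)
    p = scores / scores.sum()
    counts = rng.multinomial(num_samples, p)
    kept = np.nonzero(counts)[0]
    new_edges = [edges[i] for i in kept]
    new_weights = counts[kept] * weights[kept] / (num_samples * p[kept])
    return new_edges, new_weights


if __name__ == "__main__":
    # Hypothetical toy input: the complete graph on 6 nodes with unit weights.
    n = 6
    edges = [(u, v) for u in range(n) for v in range(u + 1, n)]
    weights = [1.0] * len(edges)
    rng = np.random.default_rng(0)
    new_edges, new_weights = sparsify(n, edges, weights, num_samples=8, rng=rng)
    print(len(edges), "edges reduced to", len(new_edges))
```

In this batch form the resistances are computed once from the full graph. The setting described in the abstract is harder, because edges arrive over time and are spread across workers, so the quantities driving the sampling have to be estimated and maintained incrementally rather than read off a single pseudoinverse.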