AI Summary
To address the scalability bottleneck of Graph Neural Networks (GNNs) in large-scale graph learning, where computation and memory costs scale linearly with the number of edges, $O(m)$, this paper proposes the Intersecting Block Graph (IBG), a low-rank approximation of directed graphs built from intersecting bipartite components, each pairing a community of source nodes with a community of target nodes. The paper proves a constructive version of the weak regularity lemma, showing that the rank needed to reach a chosen accuracy depends only on that accuracy, not on the size or sparsity of the graph. This gives time and space complexity linear in the number of nodes, $O(n)$, rather than in the number of edges. A GNN architecture operating on the IBG representation shows competitive performance on node classification, spatio-temporal graph analysis, and knowledge graph completion.
Abstract
Learning on large graphs presents significant challenges, with traditional Message Passing Neural Networks suffering from computational and memory costs scaling linearly with the number of edges. We introduce the Intersecting Block Graph (IBG), a low-rank factorization of large directed graphs based on combinations of intersecting bipartite components, each consisting of a pair of communities, for source and target nodes. By giving less weight to non-edges, we show how to efficiently approximate any graph, sparse or dense, by a dense IBG. Specifically, we prove a constructive version of the weak regularity lemma, showing that for any chosen accuracy, every graph, regardless of its size or sparsity, can be approximated by a dense IBG whose rank depends only on the accuracy. This dependence of the rank solely on the accuracy, and not on the sparsity level, is in contrast to previous forms of the weak regularity lemma. We present a graph neural network architecture operating on the IBG representation of the graph and demonstrating competitive performance on node classification, spatio-temporal graph analysis, and knowledge graph completion, while having memory and computational complexity linear in the number of nodes rather than edges.
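To illustrate why a low-rank block representation can make aggregation linear in the number of nodes, here is a minimal sketch. It assumes the IBG adjacency takes the form $S\,\mathrm{diag}(r)\,T^\top$, where `S` and `T` are hypothetical $n \times K$ source and target community affiliation matrices and `r` holds one coefficient per component. These names, the random values, and the convention that row $i$ aggregates from column $j$ are assumptions for illustration, not the paper's exact parametrization or fitting procedure.

```python
import numpy as np


def ibg_aggregate(S, T, r, X):
    """Compute (S @ diag(r) @ T.T) @ X without forming the n x n matrix.

    S, T : (n, K) source / target community affiliations (assumed form)
    r    : (K,)   one coefficient per bipartite component (assumed form)
    X    : (n, d) node features
    Cost is O(n * K * d), independent of the number of edges.
    """
    pooled = T.T @ X              # (K, d): pool features over each target community
    pooled = r[:, None] * pooled  # scale each component by its coefficient
    return S @ pooled             # (n, d): broadcast back through source communities


# Hypothetical toy sizes, chosen only so the dense check below is cheap.
rng = np.random.default_rng(0)
n, K, d = 200, 4, 8
S = rng.random((n, K))
T = rng.random((n, K))
r = rng.normal(size=K)
X = rng.normal(size=(n, d))

fast = ibg_aggregate(S, T, r, X)

# Reference: build the dense n x n approximation explicitly (O(n^2) memory).
dense = (S @ np.diag(r) @ T.T) @ X
print(np.allclose(fast, dense))
```

The reordering of the matrix products is the whole point: the factored form never touches individual edges, so the cost depends on the number of nodes $n$ and the rank $K$ only.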