🤖 AI Summary
Transductive node classification with graph convolutional networks (GCNs) lacks a statistical foundation for inference from a single graph. Method: Leveraging the geometric regularity of large graphs, we construct low-dimensional metric embeddings and develop concentration-of-measure tools tailored to a single observed graph, combining deterministic graph geometry, Erdős–Rényi-type random graph modeling, and GCN theory. Contributions: (i) We establish rigorous nonparametric learning guarantees for single-graph transductive learning that achieve the optimal rate $\mathcal{O}(N^{-1/2})$ in the number $N$ of labeled nodes, without requiring multiple independent graph samples; (ii) the framework treats both arbitrary deterministic graphs and random graphs sharing key geometric properties with Erdős–Rényi graphs; (iii) the guarantees remain informative under label sparsity, i.e., when only a few nodes are labeled.
📝 Abstract
Since their introduction by Kipf and Welling in 2017, a primary use of graph convolutional networks has been transductive node classification, where missing labels are inferred within a single observed graph and its feature matrix. Despite the widespread use of the network model, the statistical foundations of transductive learning remain limited, as standard inference frameworks typically rely on multiple independent samples rather than a single graph. In this work, we address this gap by developing new concentration-of-measure tools that leverage the geometric regularities of large graphs via low-dimensional metric embeddings. The emergent regularities are captured using a random graph model; however, the methods remain applicable to deterministic graphs once observed. We establish two principal learning results. The first concerns arbitrary deterministic $k$-vertex graphs, and the second addresses random graphs that share key geometric properties with an Erdős–Rényi graph $\mathbf{G}=\mathbf{G}(k,p)$ in the regime $p \in \mathcal{O}((\log(k)/k)^{1/2})$. The first result serves as the basis for and illuminates the second. We then extend these results to the graph convolutional network setting, where additional challenges arise. Lastly, our learning guarantees remain informative even when the number $N$ of labelled nodes is small, and they achieve the optimal nonparametric rate $\mathcal{O}(N^{-1/2})$ as $N$ grows.
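To make the setting concrete, the following is a minimal NumPy sketch of the transductive setup described above; it is an illustration under stated assumptions, not the paper's construction. It samples an Erdős–Rényi graph $\mathbf{G}(k,p)$ with $p = (\log(k)/k)^{1/2}$, applies one graph convolution in the symmetric-normalized form of Kipf and Welling, and marks $N$ labelled nodes. The helper names (`sample_er_adjacency`, `gcn_layer`), the feature and weight dimensions, and the specific values of `k` and `N` are all hypothetical choices made for the example.

```python
import numpy as np


def sample_er_adjacency(k, p, rng):
    """Symmetric 0/1 adjacency matrix of G(k, p) without self-loops."""
    # Draw each edge once in the strict upper triangle, then mirror it.
    upper = np.triu(rng.random((k, k)) < p, k=1)
    return (upper | upper.T).astype(float)


def gcn_layer(adj, features, weight):
    """One propagation step: ReLU(D^{-1/2} (A + I) D^{-1/2} X W)."""
    a_hat = adj + np.eye(adj.shape[0])   # add self-loops
    deg = a_hat.sum(axis=1)              # every degree is >= 1 here
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)


rng = np.random.default_rng(0)

k = 200                                  # number of vertices (assumed)
p = np.sqrt(np.log(k) / k)               # edge probability in the stated regime
adj = sample_er_adjacency(k, p, rng)

features = rng.standard_normal((k, 8))   # hypothetical node feature matrix
weight = rng.standard_normal((8, 4))     # hypothetical (untrained) layer weights
hidden = gcn_layer(adj, features, weight)

# Transductive setting: only N of the k nodes carry observed labels.
N = 20
labelled = rng.choice(k, size=N, replace=False)
mask = np.zeros(k, dtype=bool)
mask[labelled] = True
```

In a training loop, the loss would be evaluated only on the rows selected by `mask`, while predictions are read off for the remaining `k - N` nodes of the same graph.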