🤖 AI Summary
We address node classification on sparse graphs (expected node degree $O(1)$) with fixed feature dimension and a large number of nodes. In the asymptotic regime where the number of nodes tends to infinity, we propose and implement, for the first time, the **asymptotically locally Bayes-optimal classifier**. This classifier is exactly realizable by a message-passing GNN whose architecture continuously interpolates between an MLP (in the low graph signal-to-noise-ratio regime) and a GCN (in the high-SNR regime), showing that the two architectures arise as limiting cases of a single optimal family. Theoretically, we derive the first non-asymptotic generalization error upper bound, rigorously prove that the proposed GNN achieves the Bayes-optimal error rate, and demonstrate strict improvement over existing methods on analytically tractable SNR models. Our approach combines statistical-physics-inspired message passing, local tree-expansion analysis, sparse random graph modeling (e.g., the degree-corrected stochastic block model), and Bayesian inference, providing both a unified theoretical foundation and practical architectural guidance for learning on sparse graphs.
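The sparse regime described above can be made concrete with a small sampler for a two-community degree-corrected stochastic block model. All specifics here (two balanced communities, the degree-propensity distribution, the parameters `d_in` and `d_out`) are illustrative choices, not the paper's exact construction; the key property being demonstrated is that edge probabilities scale as $1/n$, so the expected degree stays $O(1)$ as the graph grows.

```python
import numpy as np

def sample_dcsbm(n, d_in=5.0, d_out=1.0, rng=None):
    """Sample a sparse two-community degree-corrected SBM.

    Edge probabilities scale as 1/n, so expected degrees stay O(1)
    as n grows and the graph is locally tree-like. Labels, the
    degree-correction factors theta, and the parameter values are
    illustrative, not the paper's exact data model.
    """
    rng = np.random.default_rng(rng)
    labels = rng.integers(0, 2, size=n)            # two balanced communities
    theta = rng.uniform(0.5, 1.5, size=n)          # degree-correction factors
    same = labels[:, None] == labels[None, :]
    base = np.where(same, d_in, d_out) / n         # 1/n scaling => O(1) degree
    p = np.clip(np.outer(theta, theta) * base, 0.0, 1.0)
    upper = np.triu(rng.random((n, n)) < p, k=1)   # sample upper triangle only
    A = (upper | upper.T).astype(np.int8)          # symmetrize, no self-loops
    return A, labels

A, y = sample_dcsbm(2000, rng=0)
print(A.sum(axis=1).mean())  # stays near (d_in + d_out)/2 regardless of n
```

Doubling `n` leaves the mean degree essentially unchanged, which is the defining feature of the sparse setting the paper works in.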
📝 Abstract
We study the node classification problem on feature-decorated graphs in the sparse setting, i.e., when the expected degree of a node is $O(1)$ in the number of nodes, and in the fixed-dimensional asymptotic regime, i.e., the dimension of the feature data is fixed while the number of nodes is large. Such graphs are known to be locally tree-like. We introduce a notion of Bayes optimality for node classification tasks, called asymptotic local Bayes optimality, and compute the optimal classifier according to this criterion for a fairly general statistical data model with arbitrary distributions of the node features and edge connectivity. The optimal classifier is implementable using a message-passing graph neural network architecture. We then compute the generalization error of this classifier and theoretically compare its performance against existing learning methods on a well-studied statistical model with naturally identifiable signal-to-noise ratios (SNRs) in the data. We find that the optimal message-passing architecture interpolates between a standard MLP in the regime of low graph signal and a typical graph convolution in the regime of high graph signal. Furthermore, we prove a corresponding non-asymptotic result.
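The interpolation between an MLP and a graph convolution can be sketched as a single layer with a mixing weight. This is a minimal illustration, not the paper's derived Bayes-optimal update: the function `interp_layer`, the parameter `gamma`, and the mean-over-neighbors aggregation (rather than the symmetric $D^{-1/2} A D^{-1/2}$ normalization of a standard GCN) are all assumptions made for exposition. The point is only the two limits: `gamma = 0` ignores the graph entirely (an MLP layer, appropriate at low graph SNR), while `gamma = 1` fully weights the neighborhood aggregation (a GCN-style layer, appropriate at high graph SNR).

```python
import numpy as np

def interp_layer(X, A, W_self, W_nbr, gamma):
    """One message-passing layer interpolating between an MLP and a GCN.

    gamma in [0, 1] plays the role of a graph-SNR-dependent mixing
    weight (illustrative parameterization, not the paper's optimal rule).
    gamma = 0 reduces to a plain MLP layer on the node features;
    gamma = 1 weights the degree-normalized neighborhood aggregation.
    """
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)  # avoid divide-by-zero
    agg = (A @ X) / deg                                # mean over neighbors
    H = (1 - gamma) * (X @ W_self) + gamma * (agg @ W_nbr)
    return np.maximum(H, 0.0)                          # ReLU nonlinearity
```

With `gamma = 0` the output coincides with `relu(X @ W_self)` and the adjacency matrix has no effect, matching the low-graph-signal limit described in the abstract.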