🤖 AI Summary
This work investigates whether representation learning is necessary for graph neural networks (GNNs) in the infinite-width limit, specifically under the Neural Network Gaussian Process (NNGP) regime, and whether its necessity differs between heterophilic and homophilic node classification. To this end, the authors propose the first infinitely wide GNN framework with a continuously tunable representation learning strength: the Graph Convolutional Deep Kernel Machine (GC-DKM), which combines graph convolutional kernels, Gaussian process modeling, and the deep kernel machine architecture. Theoretical analysis shows that representation learning is essential on heterophilic graphs, whereas optimal performance on homophilic node classification can be achieved without it. Empirically, GC-DKM improves accuracy by 15–30% on graph classification and heterophilic node classification benchmarks. These findings indicate that the strength of representation learning must adapt to graph structural heterogeneity, offering a new lens on both the theory and the practical design of infinite-width GNNs.
📝 Abstract
A common theoretical approach to understanding neural networks is to take an infinite-width limit, at which point the outputs become Gaussian process (GP) distributed. This is known as a neural network Gaussian process (NNGP). However, the NNGP kernel is fixed, tunable only through a small number of hyperparameters, which eliminates any possibility of representation learning. This contrasts with finite-width NNs, which are often believed to perform well precisely because they are able to learn representations. Thus, in simplifying NNs to make them theoretically tractable, NNGPs may eliminate precisely what makes them work well: representation learning. This motivated us to ask whether representation learning is necessary in a range of graph learning tasks. We develop a precise tool for this purpose, the graph convolutional deep kernel machine. It is very similar to an NNGP, in that it is an infinite-width limit and uses kernels, but it comes with a 'knob' to control the amount of representation learning. We found that representation learning is necessary (in the sense that it gives dramatic performance improvements) in graph classification tasks and heterophilous node classification tasks, but not in homophilous node classification tasks.
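To make the "fixed kernel" point concrete, here is a minimal, hypothetical sketch of a graph-convolutional NNGP-style kernel: a deterministic recursion that alternates neighbourhood aggregation on the kernel matrix with the infinite-width ReLU (arc-cosine) kernel. The function names, the depth setting, and the aggregation scheme are illustrative assumptions, not the paper's GC-DKM; in particular this sketch has no tunable "knob", which is exactly the limitation the deep kernel machine is designed to address.

```python
import numpy as np

def relu_nngp_layer(K):
    """One infinite-width ReLU layer: the degree-1 arc-cosine kernel
    recursion commonly used to build NNGP kernels."""
    d = np.sqrt(np.clip(np.diag(K), 1e-12, None))      # per-node std devs
    C = np.clip(K / np.outer(d, d), -1.0, 1.0)          # correlations in [-1, 1]
    theta = np.arccos(C)
    return (np.outer(d, d) / (2.0 * np.pi)) * (np.sin(theta)
                                               + (np.pi - theta) * np.cos(theta))

def gc_nngp_kernel(X, A_norm, depth=2):
    """Toy graph-convolutional NNGP kernel (hypothetical sketch).

    X      : (n, f) node features
    A_norm : (n, n) normalised aggregation matrix (e.g. D^{-1/2}(A+I)D^{-1/2})
    The kernel is entirely determined by X, A_norm and depth -- there are
    no learned parameters, hence no representation learning.
    """
    K = X @ X.T / X.shape[1]                # input kernel from raw features
    for _ in range(depth):
        K = A_norm @ K @ A_norm.T           # graph convolution acting on the kernel
        K = relu_nngp_layer(K)              # infinite-width nonlinearity
    return K
```

The resulting matrix can be plugged into standard GP regression or classification; the point of the sketch is that every quantity is a closed-form function of the inputs, so nothing analogous to learned hidden representations ever appears.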