🤖 AI Summary
This work challenges the prevailing assumption that graph neural networks (GNNs) need trainable neighborhood aggregation to learn effective node representations. We propose Fixed Aggregation Features (FAFs), a training-free scheme that applies fixed neighborhood aggregations, such as simple averaging, to turn graph learning tasks into tabular problems. Well-established tabular models such as MLPs can then be applied directly, which brings strong interpretability and freedom in the choice of classifier. Our theoretical analysis connects the approach to the Kolmogorov–Arnold representation theorem and discusses when mean aggregation can be sufficient. Across 14 benchmark datasets, well-tuned MLPs trained on FAFs rival or outperform state-of-the-art GNNs and graph transformers on 12. The two exceptions are Roman Empire and Minesweeper, which typically require unusually deep GNNs. These results call into question whether trainable aggregation is necessary on current benchmarks.
📝 Abstract
Graph neural networks (GNNs) are widely believed to excel at node representation learning through trainable neighborhood aggregations. We challenge this view by introducing Fixed Aggregation Features (FAFs), a training-free approach that transforms graph learning tasks into tabular problems. This simple shift enables the use of well-established tabular methods, offering strong interpretability and the flexibility to deploy diverse classifiers. Across 14 benchmarks, well-tuned multilayer perceptrons trained on FAFs rival or outperform state-of-the-art GNNs and graph transformers on 12 tasks, often using only mean aggregation. The only exceptions are the Roman Empire and Minesweeper datasets, which typically require unusually deep GNNs. To explain the theoretical possibility of non-trainable aggregations, we connect our findings to Kolmogorov-Arnold representations and discuss when mean aggregation can be sufficient. In conclusion, our results call for (i) richer benchmarks benefiting from learning diverse neighborhood aggregations, (ii) strong tabular baselines as standard, and (iii) employing and advancing tabular models for graph data to gain new insights into related tasks.
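To make the idea of turning a graph task into a tabular one concrete, here is a minimal sketch of how fixed mean-aggregation features might be computed. It is an illustration under assumptions, not the authors' implementation: the function name `fixed_aggregation_features`, the `num_hops` parameter, the dense adjacency matrix, and the choice to concatenate the raw features with one block per hop are all hypothetical, since the abstract does not specify the exact construction.

```python
import numpy as np


def fixed_aggregation_features(adj, x, num_hops=2):
    """Build a tabular feature matrix from a graph without any training.

    Hypothetical sketch: concatenates the raw node features with their
    repeated neighborhood means (one block of columns per hop).
    """
    adj = np.asarray(adj, dtype=float)
    x = np.asarray(x, dtype=float)

    # Row-normalize the adjacency matrix so that (mean_op @ h)[i] is the
    # average of h over the neighbors of node i. Isolated nodes keep a
    # zero row instead of triggering a division by zero.
    deg = adj.sum(axis=1, keepdims=True)
    mean_op = np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)

    blocks = [x]
    h = x
    for _ in range(num_hops):
        h = mean_op @ h  # fixed aggregation, no learnable parameters
        blocks.append(h)

    # One row per node: an ordinary table for any tabular model.
    return np.concatenate(blocks, axis=1)


# Toy path graph 0 - 1 - 2 with a single scalar feature per node.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
x = np.array([[1.0], [2.0], [4.0]])

faf = fixed_aggregation_features(adj, x, num_hops=2)
print(faf)
```

Each row of `faf` holds one node's own feature followed by its 1-hop and 2-hop neighborhood means, so the matrix can be passed to an MLP or any other tabular classifier together with the node labels. Because the aggregation is computed once up front, the downstream model never needs access to the graph structure.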