🤖 AI Summary
This paper critically examines the overreliance on the Weisfeiler–Lehman (WL) test in Graph Neural Network (GNN) expressivity research, identifying two fundamental mismatches: a semantic one (the WL test checks purely structural equivalence, which neither matches expressivity defined as a class of computable functions nor accommodates graphs with features) and a computational one (the WL test is not locally computable). Method: It introduces the CONGEST model from distributed computing as a rigorous theoretical foundation for GNN expressivity analysis, employing communication complexity to quantify message-passing constraints. Contribution/Results: The analysis proves that the capacity (depth multiplied by width) a GNN needs to simulate one WL iteration grows almost linearly with graph size, refuting the assumption that the WL test is locally computable. It clarifies the actual impact of virtual nodes and edges, corrects widespread misconceptions regarding "precomputation-enhanced expressivity," and argues for a more rigorous, computationally grounded, and architecture-aware expressivity framework tailored to practical GNN designs.
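The message-passing constraint that the CONGEST model formalizes can be illustrated with a minimal sketch. This is not code from the paper: `congest_round`, `compose`, and the bandwidth-by-truncation trick are illustrative assumptions; real CONGEST allows any O(log n)-bit message per edge per synchronous round.

```python
def congest_round(adj, state, compose, bandwidth_bits=32):
    """One synchronous CONGEST-style round: every node sends at most
    `bandwidth_bits` bits along each incident edge, then updates its
    state from its own state and the received messages."""
    limit = 1 << bandwidth_bits
    inbox = {v: [] for v in adj}
    for v, nbrs in adj.items():
        msg = state[v] % limit  # crude stand-in for the B-bit bandwidth cap
        for u in nbrs:
            inbox[u].append(msg)
    return {v: compose(state[v], inbox[v]) for v in adj}

# Example: flooding the minimum node id over a path 0-1-2; after
# diameter-many rounds every node holds the global minimum.
path = {0: [1], 1: [0, 2], 2: [1]}
state = {v: v for v in path}
for _ in range(2):
    state = congest_round(path, state, lambda x, msgs: min([x] + msgs))
```

The point of the paper's lower bound is that one WL iteration cannot be carried out within such per-edge bandwidth limits without capacity that grows almost linearly in the graph size.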
📝 Abstract
The success of graph neural networks (GNNs) has spurred theoretical explorations into their expressive power. In the graph machine learning community, researchers often equate GNNs with the Weisfeiler-Lehman (WL) tests as a foundation for theoretical analysis. However, we identify two major limitations of this approach: (1) the semantics of WL tests involve verifying purely structural equivalences through a set of logical sentences. As a result, they do not align well with the concept of expressive power, which is typically defined as the class of functions that GNNs can express, and they are ill-suited to handling graphs with features; (2) by leveraging communication complexity, we show that the lower bound on the capacity (depth multiplied by width) a GNN needs to simulate one iteration of the WL test grows almost linearly with the graph size. This finding indicates that the WL test is not locally computable and is misaligned with message-passing GNNs. Furthermore, we show that allowing unlimited precomputation, or directly integrating features computed by external models, while claiming that such precomputation enhances the expressiveness of GNNs, can lead to problems; such issues appear even in an influential paper published at a top-tier machine learning conference. We argue that well-defined computational models, such as the CONGEST model from distributed computing, provide a sound basis for characterizing and exploring GNNs' expressive power. Following this approach, we present some results on the effects of virtual nodes and edges. Finally, we highlight several open problems regarding GNN expressive power for further exploration.
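For readers unfamiliar with the WL iteration the abstract refers to, one round of 1-WL color refinement can be sketched as follows. The function name, hashing scheme, and toy graph are illustrative assumptions, not artifacts of the paper; the key feature is that each node's new color depends on the full multiset of its neighbors' colors.

```python
from collections import Counter

def wl_iteration(adj, colors):
    """One round of 1-WL color refinement: a node's new color is a hash
    of its current color together with the multiset of neighbor colors."""
    new_colors = {}
    for v, nbrs in adj.items():
        multiset = tuple(sorted(Counter(colors[u] for u in nbrs).items()))
        new_colors[v] = hash((colors[v], multiset))
    return new_colors

# Toy example: on a 4-cycle (a 2-regular graph), all nodes see the same
# neighborhood multiset, so refinement never distinguishes them.
cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
colors = wl_iteration(cycle, {v: 0 for v in cycle})
```

Note that computing the refined colors consistently across the graph requires comparing neighborhood multisets globally, which is precisely what the paper's communication-complexity argument shows cannot be done with bounded local bandwidth.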