🤖 AI Summary
This paper reveals a nontrivial trade-off between the expressive power and generalization performance of Graph Neural Networks (GNNs): when graph labels are determined by structural features, greater expressivity harms generalization unless the training set is sufficiently large or the training and test graphs are structurally close.
Method: The authors introduce a family of premetrics that quantify structural similarity between graphs, and build a theoretical framework linking expressivity, structural distance, model complexity, and generalization error, which yields an interpretable, data-dependent generalization bound.
Contribution/Results: The theory precisely characterizes the generalization cost of increased expressivity under structural label assumptions. Empirical validation, via structure-aware label modeling and extensive experiments, confirms that overly expressive GNNs degrade performance under limited samples or structural distribution shift. Theoretical predictions align closely with empirical observations, offering both explanatory insight and practical guidance for GNN design.
📝 Abstract
Graph Neural Networks (GNNs) are powerful tools for learning on structured data, yet the relationship between their expressivity and predictive performance remains unclear. We introduce a family of premetrics that capture different degrees of structural similarity between graphs and relate these similarities to generalization and, consequently, to the performance of expressive GNNs. By considering a setting where graph labels are correlated with structural features, we derive generalization bounds that depend on the distance between training and test graphs, model complexity, and training set size. These bounds reveal that more expressive GNNs may generalize worse unless their increased complexity is balanced by a sufficiently large training set or reduced distance between training and test graphs. Our findings relate expressivity and generalization, offering theoretical insights supported by empirical results.
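The abstract does not spell out the paper's premetric family, but the idea of a premetric (nonnegative, zero on identical graphs, yet possibly zero on distinct graphs) can be illustrated with a sketch based on Weisfeiler-Leman (WL) color refinement, the same refinement that bounds the expressivity of message-passing GNNs. The function names here (`wl_histogram`, `wl_premetric`) and the choice of an L1 distance over normalized color histograms are illustrative assumptions, not the paper's actual construction:

```python
# Illustrative sketch (not the paper's definition): a WL-based
# premetric between graphs given as adjacency dicts {node: [neighbors]}.
from collections import Counter

def wl_histogram(adj, iterations=3):
    """Counter of WL-refined node colors accumulated over all iterations."""
    colors = {v: 0 for v in adj}          # uniform initial coloring
    hist = Counter(colors.values())
    for _ in range(iterations):
        # A node's new color hashes its own color together with the
        # sorted multiset of its neighbors' colors.
        colors = {
            v: hash((colors[v], tuple(sorted(colors[u] for u in adj[v]))))
            for v in adj
        }
        hist.update(colors.values())
    return hist

def wl_premetric(adj_g, adj_h, iterations=3):
    """L1 distance between normalized WL color histograms.

    A premetric, not a metric: it is zero on identical graphs but can
    also be zero for non-isomorphic graphs that 1-WL cannot distinguish
    (e.g. a 6-cycle vs. two disjoint triangles).
    """
    hg = wl_histogram(adj_g, iterations)
    hh = wl_histogram(adj_h, iterations)
    ng, nh = sum(hg.values()), sum(hh.values())
    return sum(abs(hg[k] / ng - hh[k] / nh) for k in set(hg) | set(hh))
```

For example, a triangle and a 3-node path are separated by this premetric (their degree structure differs after one refinement step), while a 6-cycle and two disjoint triangles are not, since both are 2-regular and 1-WL colors every node identically. This blind spot is exactly the kind of structural coarseness a premetric tolerates and a metric would not.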