🤖 AI Summary
Problem: Existing graph-learning benchmarks consist mostly of small graphs tailored to inductive tasks, offering little insight into long-range dependencies, and mainstream models — GNNs and Graph Transformers — are compared without any direct, quantitative measure of long-range interaction. Method: The authors introduce (1) City-Networks, a large-scale transductive dataset built from real-world urban road networks, with graphs of over 10⁵ nodes and far larger diameters than existing benchmarks, annotated via an eccentricity-based labeling task that explicitly requires information from distant nodes; and (2) a model-agnostic influence measure based on the Jacobians of multi-hop neighbors, giving a principled quantification of long-range information flow. Contribution/Results: Theoretical analysis of over-smoothing and influence-score dilution justifies both the dataset design and the proposed measure, and empirical evaluation exposes the long-range performance bottlenecks of state-of-the-art GNNs, pushing graph representation learning from purely local aggregation toward global structural modeling.
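To make the labeling task concrete: a node's eccentricity is its greatest shortest-path distance to any other node, so predicting it forces a model to see far beyond its local neighborhood. The sketch below (illustrative only; function names and the binning scheme are assumptions, not taken from the paper) computes eccentricities by BFS on an adjacency-list graph and bins them into class labels:

```python
from collections import deque

def bfs_eccentricity(adj, src):
    """Eccentricity of `src`: max BFS distance over all reachable nodes."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return max(dist.values())

def eccentricity_labels(adj, num_classes=4):
    """Bin each node's eccentricity into one of `num_classes` labels
    (an illustrative binning; the paper's exact scheme may differ)."""
    ecc = {v: bfs_eccentricity(adj, v) for v in adj}
    lo, hi = min(ecc.values()), max(ecc.values())
    span = max(hi - lo, 1)
    return {v: min(num_classes - 1, (e - lo) * num_classes // span)
            for v, e in ecc.items()}

# Toy example: an 8-node path graph (diameter 7). Endpoints get the
# largest label, central nodes the smallest — a k-layer GNN with
# k < diameter cannot even see the information the label depends on.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < 8] for i in range(8)}
labels = eccentricity_labels(adj)
```

On real city networks with >10⁵ nodes, exact all-pairs BFS is expensive, which is part of why high-diameter road graphs make a demanding long-range benchmark.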
📝 Abstract
Long-range dependencies are critical for effective graph representation learning, yet most existing datasets focus on small graphs tailored to inductive tasks, offering limited insight into long-range interactions. Current evaluations primarily compare models employing global attention (e.g., graph transformers) with those using local neighborhood aggregation (e.g., message-passing neural networks) without a direct measurement of long-range dependency. In this work, we introduce City-Networks, a novel large-scale transductive learning dataset derived from real-world city roads. This dataset features graphs with over $10^5$ nodes and significantly larger diameters than those in existing benchmarks, naturally embodying long-range information. We annotate the graphs using an eccentricity-based approach, ensuring that the classification task inherently requires information from distant nodes. Furthermore, we propose a model-agnostic measurement based on the Jacobians of neighbors from distant hops, offering a principled quantification of long-range dependencies. Finally, we provide theoretical justifications for both our dataset design and the proposed measurement, particularly by focusing on over-smoothing and influence score dilution, which establishes a robust foundation for further exploration of long-range interactions in graph neural networks.
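The Jacobian-based measurement asks how much a distant node's input features affect a node's final representation, i.e. the magnitude of $\partial h_i / \partial x_j$. For a linear mean-aggregation model with $L$ layers — an assumption made here purely to get a closed form, not the paper's actual models — that Jacobian reduces to the $(i,j)$ entry of the $L$-th power of the normalized adjacency, which lets a small pure-Python sketch illustrate influence dilution (all names below are illustrative):

```python
def normalized_adjacency(adj, n):
    """Row-normalized adjacency with self-loops (mean aggregation)."""
    A = [[0.0] * n for _ in range(n)]
    for u in range(n):
        nbrs = adj[u] + [u]          # include self-loop
        w = 1.0 / len(nbrs)
        for v in nbrs:
            A[u][v] = w
    return A

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def influence(adj, n, layers):
    """(A_norm^L)[i][j]: the Jacobian-based influence of input node j on
    node i's representation, under the linear-model assumption above."""
    A = normalized_adjacency(adj, n)
    P = A
    for _ in range(layers - 1):
        P = matmul(P, A)
    return P

# Toy 6-node path graph: node 5 has zero influence on node 0 until the
# layer count reaches the 5-hop distance (under-reaching), and even then
# its influence is a tiny slice of node 0's total — influence dilution.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j < 6] for i in range(6)}
P2 = influence(adj, 6, 2)   # 2 layers: node 5 invisible to node 0
P5 = influence(adj, 6, 5)   # 5 layers: influence 0.5 * (1/3)^4 ≈ 0.006
```

Because each row of the normalized adjacency sums to one, each row of its powers does too, so the influence entries directly show what fraction of a node's representation a distant neighbor can contribute — and why stacking layers alone does not solve long-range modeling.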