Local Distance-Preserving Node Embeddings and Their Performance on Random Graphs

📅 2025-04-11
📈 Citations: 0
Influential: 0
📄 PDF
🤖 AI Summary
This paper addresses the challenge of jointly preserving local similarity and modeling global distances in graph node embedding. We propose a landmark-based shortest-path approximation method that locally preserves graph distances in low-dimensional space. Theoretically, we prove that for random graphs—including Erdős-Rényi graphs—the required embedding dimension for landmark-based representations is significantly lower than the worst-case bound dictated by Bourgain-type metric embedding theory. Empirically, graph neural networks (GNNs) efficiently learn and generalize pairwise distances between landmarks, achieving high accuracy and strong scalability on large-scale graphs. Our key contributions are twofold: (1) establishing, for the first time, a theoretical connection between random graph structure and optimal embedding dimensionality; and (2) empirically validating the strong generalization capability of GNNs in learning landmark distances. This work introduces a novel, lightweight, and scalable paradigm for graph distance modeling.

📝 Abstract
Learning node representations is a fundamental problem in graph machine learning. While existing embedding methods effectively preserve local similarity measures, they often fail to capture global functions like graph distances. Inspired by Bourgain's seminal work on Hilbert space embeddings of metric spaces (1985), we study the performance of local distance-preserving node embeddings. Known as landmark-based algorithms, these embeddings approximate pairwise distances by computing shortest paths from a small subset of reference nodes (i.e., landmarks). Our main theoretical contribution shows that random graphs, such as Erdős–Rényi random graphs, require lower dimensions in landmark-based embeddings compared to worst-case graphs. Empirically, we demonstrate that the GNN-based approximations for the distances to landmarks generalize well to larger networks, offering a scalable alternative for graph representation learning.
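The landmark idea in the abstract can be illustrated with a minimal sketch (not code from the paper): embed each node as its vector of BFS distances to a few landmark nodes, then bound any pairwise distance from those vectors via the triangle inequality. Function and variable names here are illustrative assumptions.

```python
from collections import deque

def bfs_distances(adj, source):
    """Unweighted shortest-path distances from `source` via BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def landmark_embedding(adj, landmarks):
    """Embed each node as its vector of distances to the landmarks.

    The embedding dimension equals the number of landmarks; the paper's
    theoretical result concerns how few landmarks suffice on random graphs.
    """
    per_landmark = [bfs_distances(adj, l) for l in landmarks]
    return {v: [d[v] for d in per_landmark] for v in adj}

def approx_distance(emb_u, emb_v):
    """Triangle-inequality bounds from landmark distances:
    max_l |d(l,u) - d(l,v)|  <=  d(u,v)  <=  min_l d(l,u) + d(l,v)."""
    lower = max(abs(a - b) for a, b in zip(emb_u, emb_v))
    upper = min(a + b for a, b in zip(emb_u, emb_v))
    return lower, upper

# Toy example: a path graph 0-1-2-3-4 with landmarks at both ends.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
emb = landmark_embedding(adj, landmarks=[0, 4])
lo, hi = approx_distance(emb[1], emb[3])  # true distance d(1,3) = 2
```

On this path graph the lower bound is tight (lo == 2), while the upper bound routes through a landmark (hi == 4); landmark placement determines which pairs are approximated well.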
Problem

Research questions and friction points this paper is trying to address.

Study local distance-preserving node embeddings for graphs
Compare landmark-based embeddings on random vs worst-case graphs
Evaluate GNN-based distance approximations for scalability
Innovation

Methods, ideas, or system contributions that make the work stand out.

Landmark-based node embeddings preserve distances
Lower dimensions needed for random graphs
GNN approximates distances for scalability
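Since the dimension result above is stated for Erdős–Rényi graphs, a quick G(n, p) sampler is handy for experimenting with landmark embeddings on random graphs. This is a standard construction, not code from the paper; names are illustrative.

```python
import random

def erdos_renyi(n, p, seed=None):
    """Sample an Erdős–Rényi graph G(n, p): each of the n*(n-1)/2
    possible undirected edges is included independently with probability p."""
    rng = random.Random(seed)
    adj = {v: [] for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj

# With p above the connectivity threshold (~ ln(n)/n), the sampled
# graph is connected with high probability, so BFS from a handful of
# random landmarks reaches every node.
graph = erdos_renyi(n=100, p=0.1, seed=0)
```

For larger experiments, `networkx.gnp_random_graph` provides the same model with a fast sparse sampler.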